Ideas That Created the Future
Classic Papers of Computer Science
edited by Harry R. Lewis
The MIT Press
Cambridge, Massachusetts
London, England
© 2021 Harry R. Lewis, except as stated at the bottom of the first page of each chapter.
All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.
Library of Congress Cataloging-in-Publication Data
Names: Lewis, Harry R., editor.
Title: Ideas that created the future: classic papers of computer science / edited by Harry R. Lewis.
Description: Cambridge, Massachusetts: The MIT Press, [2021] | Includes bibliographical references and index.
Identifiers: LCCN 2020018950 | ISBN 9780262045308 (paperback)
Subjects: LCSH: Computer science.
Classification: LCC QA76.I34 2020 | DDC 004–dc23
LC record available at https://lccn.loc.gov/2020018950
To those quirky and skeptical teachers who took me new places:
Phil Bridgess, Jim Maguire, Van Elliott, Desmond O’Grady,
Sheila Greibach, Tom Cheatham, Ivan Sutherland, and Burt Dreben
“I can tell your future. Nothing easier. … But who can tell your past? … What did it mean? What was it trying to say to you?”
—Thornton Wilder, The Skin of Our Teeth
Figure 2.1: Binary to decimal conversion and a binary sum, from Leibniz’s De progressione dyadica (1679).
Figure 7.1: Expected multiplication speed
Figure 8.1: Symbol for hindrance function
Figure 8.2: Interpretation of addition
Figure 8.3: Interpretation of multiplication
Figure 8.4: Analogue between the calculus of propositions and the symbolic relay analysis
Figure 8.5: Expansion about one variable
Figure 8.6: Circuit to be simplified
Figure 8.7: Simplification of Figure 8.6
Figure 9.1: The neuron c_i is always marked with the numeral i upon the body of the cell, and the corresponding action is denoted by “N” with i as subscript, as in the text.
Figure 10.1: Title page of “First draft” typescript
Figure 10.2: Clock pulses
Figure 10.3: First draft EDVAC “orders” (instructions)
Figure 12.1: Schematic diagram of a general communication system.
Figure 12.2: Graphical representation of the constraints on telegraph symbols.
Figure 12.3: A graph corresponding to the source in example B.
Figure 12.4: A graph corresponding to the source in example C.
Figure 12.5: A graph corresponding to the source in example D.
Figure 12.6: Decomposition of a choice from three possibilities.
Figure 12.7: Entropy in the case of two possibilities with probabilities p and (1 − p).
Figure 16.1: An operation
Figure 16.2: Solution of problem
Figure 16.3: The UNIVAC system. [EDITOR: UNITYPER was a typewriter input device, UNISERVO a magnetic tape drive.]
Figure 16.4: Solution of a problem.
Figure 16.5: Compiling routines and subroutines.
Figure 16.6: Compiling Type B and task routines.
Figure 16.7: Computing system.
Figure 16.8: Operation
Figure 16.9: Example [EDITOR: aaL = “add to a limit” (Ash et al., 1957, page 86).]
Figure 16.10: Variable table for Figure 16.9
Figure 18.1: Organization of a perceptron.
Figure 22.1: The Two Sides of the H-LAM/T System
Figure 23.1: Service vs. number of active users
Figure 23.2: Flow chart of the multi-level scheduling algorithm
Figure 24.1: Hexagonal pattern.
Figure 24.2: TX-2 operating area—Sketchpad in use. On the display can be seen part of a bridge …. The Author is holding the Light pen. The push buttons used to control specific drawing functions are on the box in front of the Author. Part of the bank of toggle switches can be seen behind the Author. The size and position of the part of the total picture seen on the display is obtained through the four black knobs just above the table.
Figure 24.3: Plotter used with Sketchpad. A digital and analog control system makes the plotter draw straight lines and circles either under direct control of the TX-2 or off-line from punched paper tape.
Figure 24.4: Line and circle drawing.
Figure 24.5: Illustrative example.
Figure 24.6: Four positions of linkage. Number shows length of dotted line.
Figure 24.7: Half hexagons and semicircles on same lattice.
Figure 32.1: Structure 1. Projects subordinate to parts
Figure 32.2: Structure 2. Parts subordinate to projects
Figure 32.3: Structure 3. Parts and projects as peers, commitment relationship subordinate to projects
Figure 32.4: Structure 4. Parts and projects as peers, commitment relationship subordinate to parts
Figure 32.5: Structure 5. Parts, projects, and commitment relationship as peers
Figure 32.6: A relation of degree 4
Figure 32.7: A relation with two identical domains
Figure 32.8: Unnormalized set
Figure 32.9: Normalized set
Figure 32.10: A permuted projection of the relation in Figure 32.6
Figure 32.11: Two joinable relations
Figure 32.12: The natural join of R with S (from Figure 32.11)
Figure 32.13: Another join of R with S (from Figure 32.11)
Figure 33.1: Implementation steps to deliver a small computer program for internal operations.
Figure 33.2: Implementation steps to develop a large computer program for delivery to a customer.
Figure 33.3: Hopefully, the iterative interaction between the various phases is confined to successive steps.
Figure 33.4: Unfortunately, for the process illustrated, the design iterations are never confined to the successive steps.
Figure 33.5: Step 1: Insure that a preliminary program design is complete before analysis begins.
Figure 33.6: Step 2: Insure that documentation is current and complete—at least six uniquely different documents are required.
Figure 33.7: Step 3: Attempt to do the job twice—the first result provides an early simulation of the final product.
Figure 33.8: Step 4: Plan, control, and monitor computer program testing.
Figure 33.9: Step 5: Involve the customer—the involvement should be formal, in-depth, and continuing.
Figure 33.10: Summary.
Figure 36.1: Complete problems
Figure 38.1: Typical packet switching network
Figure 38.2: Three networks interconnected by two GATEWAYS
Figure 38.3: Internetwork packet format (fields not shown to scale)
Figure 38.4: TCP address
Figure 38.5: Creation of segments and packets from messages
Figure 38.6: Segment format (process header and text)
Figure 38.7: Assignment of sequence numbers
Figure 38.8: Internetwork header flag field
Figure 38.9: Message splitting and packet splitting
Figure 38.10: The window concept
Figure 38.11: Conceptual TCB format [EDITOR: “TCB” = transmit control block.]
Figure 40.1: Time versus number of workers—perfectly partitionable task.
Figure 40.2: Time versus number of workers—unpartitionable task.
Figure 40.3: Time versus number of workers—partitionable task requiring communication.
Figure 40.4: Time versus number of workers—task with complex interrelationships.
Figure 41.2: A two-segment Ethernet
Figure 41.3: Ethernet packet layout
Figure 41.4: Collision control algorithm
Figure 41.5: EFTP packet layout
Figure 42.1: Flow of information in conventional cryptographic system
Figure 42.2: Flow of information in public key system
Figure 42.3: Secure cryptosystem used as one-way function
Figure 44.1: The verifiers’ original analogy
Figure 44.2: Our analogy
This volume tells the story of computer science in the words that created the field. The collection came into being for two reasons. First, to relieve 21st-century readers of the misimpression that the established conventions of the field were handed down to contemporary culture in finished form. Computer science has a rich family history that should be known to students and practitioners of the field. And second, to help readers see how important new ideas come to be, how tentative, clumsy baby steps become graceful, progressive strides—or sometimes go nowhere for years and then are taken up again after a long delay. The field is still young and dynamic; anyone looking around it today might see some novelty that will be tomorrow’s canon, hardly recognizable in its infancy.
To tell this story I have selected, with considerable anxiety, 46 papers from the earliest times to 1980, and have introduced each with a brief context-setting essay. Each of these papers made a memorable contribution to the field. Many others might have been selected in addition to or instead of these, and the 1980 cut-off date is arbitrary—though it does represent a moment when the field became so diversified as to defy summation in a small collection like this one.
This book is an educational volume documenting the origins of the field, and neither a critical edition of the papers nor a history of the field. The Introduction sets the papers in their historical context, but for those seeking a more thorough and nuanced account of historical developments, Priestley (2011) successfully avoids both over-reliance on self-interested recollections of the principals and post hoc ergo propter hoc fallacies. For the early history of computing machines, the reader is referred to Pratt (1987) and Jones (2016).
This volume is suitable for a one-semester course aimed at graduate students or advanced undergraduates, and has been used for that purpose at both Harvard and MIT. It can also serve as a guided tour for a curious professional. Among the factors considered when selecting and editing papers were these:
1. Many papers are heavily excerpted, both to focus on the key contributions and to omit technical details no longer of great interest. For example, I have omitted the ugly details of Turing’s code for his universal machine. Omitted text is indicated throughout by use of an ellipsis (…). Needless to say, readers hungry for more detail should track down the full papers. Each chapter begins with the bibliographic reference for the paper, though the year in the bibliographic citation may not match the year included with the title, since some papers were presented orally first or revised after their first publication. Moreover, the invention or discovery documented in the paper may have occurred earlier than the publication date.
2. I preferred short and readable papers to long or difficult papers, no matter how important.
3. I included only papers, not book extracts (with the exception of Boole’s The Laws of Thought and the title essay from Brooks’s collection The Mythical Man-Month).
4. I did not attempt to condense the defining reports for major programming languages such as FORTRAN, COBOL, or ALGOL, important though they were to the development of the field.
5. Important early efforts to systematize the field are also omitted, for example Curriculum 68 (Atchison et al., 1968) and the history of programming languages by Jean Sammet (1972).
6. The page count of authors included cannot be taken as a measure of their importance as computer scientists of the period. Donald Knuth would be under-represented by that standard and Edsger Dijkstra perhaps over-represented; great names such as Backus, Chomsky, Church, Floyd, Gray, Kleene, Newell, Lamport, Lampson, Rabin, Scott, and Tarjan are missing entirely. Apologies to anyone whose favorite paper or scientist did not make it in!
And a few purely editorial details.
For their careful proofreading I am grateful to the students of Harvard CS191 in 2019 and MIT 6.S897 in 2020. Brian Sapozhnikov and Adham Meguid were especially eagle-eyed. Thanks also to Peter Denning, Bill Gasarch, Warren Goldfarb, Matthew Lena, Maryanthe Malliaris, Tasha Schoenstein, Lloyd Strickland, Sherry Turkle, and Joel Wachman for their helpful comments and corrections. Of course, any remaining errors are solely my responsibility.
Harry Lewis
July 2020
The first computing was numerical calculation; piles of calculi (pebbles) were tools of early accountants. Later notational and mechanical inventions were put to use by observers of the skies and by engineers of military and naval adventures. The demand for calculation intensified as the physical world was subjected to human understanding and control, especially during the Enlightenment, and later in wartime.
But the intellectual roots of computer science are not merely bookkeeping, astronomy, and ballistics. Computer science is the child of logic, the mathematical sciences, and the human imagination. Because of its mixed intellectual parentage and its only indirect connection to the natural world, the field struggled for legitimacy through much of the twentieth century. By the time computational phenomena became ubiquitously significant in science, engineering, and economics, as well as in mathematics, the energy had drained from the semantic wars about whether computer science, or any other “science of the artificial” (Simon, 1996), could legitimately be called a “science.” That the field became accepted as a science is to the credit of mid-twentieth century education pioneers who designed the first courses and curricula for colleges and universities, often against resistance from the departments of mathematics or engineering from which it was emerging. We cannot tell their story here, but it is fair to note that this retelling of the genesis of the field might have been very different had the educational systematizers organized the field differently.
Logic lurked in the family tree since ancient times, making only clumsy and episodic contact with calculation until, in the mid-nineteenth century, it became a tool of metamathematics. Algorithms were reified in the first decades of the twentieth century as part of the program of metamathematicians to determine what mathematics can be known to be true. And scientific and business calculation, which had stimulated a long series of incremental improvements in mechanical calculator design, got an important boost from the unique demands placed on physics and engineering during World War II.
All along the way a hubristic idea kept nosing its way into the serious work: might human beings breathe life into the machines they were building? As computer science matured, that mythic vision wound around the mathematics of computation in a tightening gyre, portending fusion in a singularity. With continued improvements in the quality of synthetic vision, speech, and dexterity, the debate continues about the consequences for individuals and societies.
To be sure, the birthdate of computer science is arbitrary. We leave to the field’s prehistory works such as the Arithmetic of Diophantus in the third century (Diophantus, 1910) and the Algebra of al-Khwārizmī in the ninth century (al Khwārizmī, 1915), intellectually remarkable though both are. (Diophantus does make a cameo appearance in chapter 5 of this volume, for having studied numerical questions which, once generalized, turned out to be recursively unsolvable.)
But we have to start with Aristotle, who gave us the notion that propositions can include variables representing properties (chapter 1). Such a proposition might be true regardless of the instantiation of the variables, or might be true only sometimes, or never. Such logical analysis has a specific relevance to computer science. What makes a digital computer “general-purpose” is that the same binary logic elements can signify different things at different times. A storage register that signifies time of day when it is used in one program may at the next instant signify street address in a different program. To Aristotle we owe the realization that fixed rules of logic can apply to different phenomena—that logic provides a general framework for reasoning.
Early in the seventeenth century Kepler formulated mathematical laws for the trajectories of the planets, Descartes reduced geometry to algebra, and Pascal characterized fluids in mathematical terms. Since observable physical phenomena were being described by mathematical formulas, the ancient problems of calculating areas and volumes of curved shapes assumed greater practical importance, all the more so because, with advances in optics, celestial measurements became more precise. The advances in the mathematics of continuous quantities laid the groundwork for the nearly simultaneous invention of the calculus of infinitesimals by Isaac Newton in England and by Gottfried Leibniz on the Continent.
Leibniz was a skilled calculator. Pascal had developed an ingenious mechanical adding machine, having observed his father’s labors with accounting for tax collections; Leibniz extended the device to carry out multiplications and divisions—thus building one of the first nested-loop calculating engines (but see page 62). Leibniz understood and wrote about the merits of binary notation long before its utility was seen by others; even the pioneer Howard Hathaway Aiken, working more than 250 years later, gave up decimal arithmetic only reluctantly. Most importantly, in addition to inventing the calculus of infinitesimals, Leibniz conceived of a calculus of ideas. Chapter 2 is only one of many utopian visions about rationalizing human affairs, and Leibniz did not get very far toward executing his plan. He was aware that in the thirteenth century, Ramon Llull had built a machine to perform certain syllogistic inferences, and Leibniz notes that once his calculus is complete, humankind “will have an instrument which will serve to exalt reason no less than the Telescope serves to perfect vision.” The logical atoms on which his calculus would operate reappear almost two centuries later in the work of George Boole (see below) and have become the base facts of logic programming languages such as Prolog and Datalog.
Charles Babbage’s design for his Analytical Engine (chapter 3) might have marked the arrival of the age of computers, as the device would have been both programmable and adaptable. But Babbage could not get it built. He kept running out of money, in spite of pleas that the machine was almost done—in 1835, Babbage (1989, p. 245) wrote, “the greatest difficulties of the invention have already been surmounted, and that the plans will be finished in a few months”—and was important to the national defense—“the control of the Analytical Engine over all the great Astronomical questions on which the safety of the Navy so much depends can scarcely fail to impart to the subject an interest in the mind of Her Majesty,” he wrote to Prince Albert (Babbage, 1843). Nonetheless, Babbage’s apprentice Ada Lovelace was able to grasp a variety of recognizably modern programming concepts by reasoning about this never-functional machine.
George Boole (1854, here chapter 4) had a different agenda. About 25 years Babbage’s junior and not university-educated, Boole did cross paths with Babbage at least once, but his own goal was to bring Aristotle up to date, to capture the rules of human reason in mathematical form. His work on logic was the more innovative for being out of the mainstream, and its influence on computing was limited—until the 1930s, when Claude Shannon made it an essential tool for designing digital circuits.
By the turn of the twentieth century logicians were using the methods of mathematics itself to mathematize the idea of proof and were seeking to complete Leibniz’s logical agenda. David Hilbert’s grand challenge, his Entscheidungsproblem—to determine whether formalized mathematical statements could be proved—was posed crisply only some years after his famous 1900 address to the International Congress of Mathematicians, excerpted here as chapter 5. But “Mathematical Problems” both foreshadows mechanized logic with its appeal to finitary methods and conveys a Leibnizian optimism that if the world’s mathematicians work hard enough, their house will be put in order.
It was not to be, at least not as Hilbert had imagined it. First Gödel, then Church, and then Turing (1936, here chapter 6) illuminated a metamathematical world that could not have been imagined before the twentieth century. Though each had a significant impact on computer science, Turing’s contribution was the most important, because (a) it convincingly formalized the idea of a computing machine, and thus the idea that it was possible to prove things about such machines; (b) it persuasively reified computational universality and showed that devices made of simple components could achieve it; and (c) it combined (a) and (b) with irrefutable logic to prove that there were limits to the computable.
Turing’s demolition of Hilbert’s Entscheidungsproblem is the exclamation point of his paper. An important technical trick in the middle of the proof uses the data storage unit of an automaton to store the programs of other automata. To complete his argument, Turing then reached back to Cantor’s 1891 diagonalization argument—originally devised for the very different purpose of showing the uncountability of the reals (Cantor, 1996). A decade later, Eckert and Mauchly incorporated a similar stored-program idea into a modification of their ENIAC computer, though they did so for purely practical reasons, uninfluenced by Turing’s theoretical work—storing the program in vacuum tube memory sped up the machine and made the program easier to change (Priestley, 2011, p. 125). When John von Neumann teamed up with Eckert and Mauchly in 1945 to design the EDVAC, a successor machine to the ENIAC, it fell to him to write the design notes (chapter 10), and the stored-program design has ever since been known, too generously, as the “von Neumann architecture.” The same idea was used in the Manchester “Baby” about the same time and Turing’s ACE soon after. In his proposal Turing (1945, page 3) cites the EDVAC report, but the stored program seems to have been one of those scientific ideas that emerge almost simultaneously in more than one place when the time is ripe.
But back to the 1930s. While Turing was moving to Princeton to further his studies of logic, the applied mathematician Howard Hathaway Aiken (Aiken et al., 1964, here chapter 7) was working at Harvard on the old problem of numerical computing, on which not much progress had been made since Leibniz. Aiken designed the massive, electromechanical Mark I to print tables of the values of mathematical functions—some of them, like the Bessel functions, straight out of the eighteenth century tradition of mathematical analysis, and some of them, ballistic trajectories for example, to meet the demands of modern warfare. His project wound up in a dispute with his partner, IBM, over whose machine it really was—a question akin to the one about Babbage: Does the glory go to the designer of a computer or to the people who build it and get it running?
Aiken was not alone in thinking about automatic calculation in the late 1930s. Konrad Zuse was working on his own electromechanical calculators in Berlin, and John Vincent Atanasoff was developing an electronic machine in Iowa, all independently—though Eckert and Mauchly, working for the Army at the University of Pennsylvania, certainly knew about Atanasoff’s work, a circumstance that later resulted in bitter patent litigation.
As computing was making small hatching sounds in several separate places, telephony was exploding everywhere. There was more than one way to wire several switches together to produce the same functional result, and clever engineers mastered the art of shrinking the required hardware. When Claude Shannon (1938, here chapter 8) started working on these problems as an MIT graduate student, he realized that Boole’s Laws of Thought, which he knew from an undergraduate philosophy course, were also the laws of circuits. Once a circuit had been translated into boolean logic, the logic formula could be simplified and then translated back into a more economical circuit. These methods assumed great importance for the design of digital computers, and are still in use to this day.
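The translate, simplify, and translate-back loop described above can be illustrated with a small sketch. It assumes the modern reading in which contacts in series act as AND and parallel branches act as OR (Shannon's own hindrance notation differs), and the two circuits and all function names are hypothetical examples rather than anything taken from the 1938 paper. The check is an exhaustive truth table over the three switches.

```python
from itertools import product

def original_circuit(x: bool, y: bool, z: bool) -> bool:
    # Three parallel branches of series contacts: x-y, x-z, x-y-z (7 contacts).
    return (x and y) or (x and z) or (x and y and z)

def simplified_circuit(x: bool, y: bool, z: bool) -> bool:
    # x in series with (y parallel z): 3 contacts.
    # Obtained by absorption (xy + xyz = xy) and factoring (xy + xz = x(y + z)).
    return x and (y or z)

def equivalent(f, g, n_inputs: int = 3) -> bool:
    # Two switching circuits are interchangeable if they agree on every
    # combination of switch positions.
    return all(f(*v) == g(*v) for v in product([False, True], repeat=n_inputs))

if __name__ == "__main__":
    print(equivalent(original_circuit, simplified_circuit))
```

The simplified form needs fewer than half the contacts of the original, which is the kind of hardware saving the paragraph refers to.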
Binary representation has many advantages for electrical processing. Not only can boolean logic be used to manipulate complicated expressions, but it is easier to restore a degraded signal to its true value if there are only two possible values, one voltage to represent 0 and the other to represent 1. With the binary system firmly accepted, the mechanization of logic accelerated during the 1940s. The Eckert–Mauchly–von Neumann team used binary in their design for the EDVAC (von Neumann, 1993, here chapter 10), and carried out some of the earliest analysis of algorithms for binary arithmetic. Shannon published a second landmark paper, tying communications engineering to binary data representation, and defining “bit” along the way (Shannon, 1948, here chapter 12). And Hamming (1950, here chapter 13) presented a general method for adding extra bits to binary data so that errors could be detected and, under some circumstances, corrected after data was garbled in transit.
Important as they were, all these developments were to some degree incremental. The stored program architecture had many parents and ancestors, and Shannon gives due credit for his work to Nyquist and other communications engineers of the day. The work of neuroscientist Warren McCulloch and self-taught logician Walter Pitts (McCulloch and Pitts, 1943, here chapter 9) was altogether different. Nothing like it had ever been published before. A robust Renaissance literature likened the human body to assemblies of levers, and in the nineteenth century the analogy expanded to connect the energy usage of the body to that of the steam engine. The notion that thinking was a form of calculation precedes Leibniz, going back at least to Thomas Hobbes (1655, p. 2): “by ‘reasoning,’ I mean computation” (per ratiocinationem autem intelligo computationem). But to explain the brain itself as a specific kind of mechanism was a new thing, and it was even more audacious to connect the all-or-nothing firing of neurons to the switching of electric circuits and thereby to the elaborate logical calculus of Whitehead and Russell (1910). McCulloch and Pitts did not merely toy with this analogy; they declared the mysteries of the human mind solved—except for details to be worked out later. Their work was the only one cited in von Neumann’s first draft report on computer design (chapter 10). Nerve nets evolved into an important computing model, though the details of the McCulloch–Pitts work have mostly been supplanted. The crucial reboot of a more realistic neural model was the paper of Frank Rosenblatt (1958a, here chapter 18) on perceptrons, though nerve nets today have a life in computer science theory largely divorced from their neuroanatomical origins.
The reciprocal idea, that machines could be made to act like human beings, runs very deep in mythology. In the Iliad, Homer described servant automata of the divine blacksmith Hephaestus: “He set the bellows away from the fire …, took up a heavy stick in his hand, and went to the doorway limping. And in support of their master moved his attendants. These are golden, and in appearance like living young women. There is intelligence in their hearts, and there is speech in them and strength, and from the immortal gods they have learned how to do things. These stirred nimbly in support of their master” (Homer, 1962, 18.412ff.). Already in the eighth century BCE, this passage described several subfields of artificial intelligence: general intelligence, learning, speech, dexterity. Boole seems to have discussed Babbage’s “thinking engine” with him during their one known meeting in 1862, though Lady Lovelace had cautioned against expecting the Analytical Engine to be capable of original thought. But the cat was out of the bag; soon after, Samuel Butler, in “Darwin among the machines,” anticipated that machines could evolve to become intelligent (Butler, 1863, originally published anonymously).
By the late 1940s, Turing was in a position to combine what he knew about computers (not just the theory—he had been designing and building computing machines) with the centuries-old speculations about thinking machines and the modern British practice of analytic philosophy. The result was “Computing machinery and intelligence” (Turing, 1950, here chapter 14), in which Turing imagined, as a thought-experiment alternative to the metaphysical thinking computer, a machine that could successfully fool a human interrogator into believing that the machine was human. Turing’s dispassionate dissection of the counter-arguments (including Lady Lovelace’s) has stimulated debate and response ever since.
Starting in 1964, Joseph Weizenbaum (1966, here chapter 27) wrote a very primitive program that held content-free chats with humans, simply pasting words the human had used into syntactically correct positions in the computer’s responses. Weizenbaum saw the program, which he dubbed ELIZA, as a technical demonstration about limited language processing. It was not intended for significant conversation, but the willingness of people to engage with it as they would with fellow humans raised significant ethical questions about human–computer interactions. Norbert Wiener (1960, here chapter 19) had already challenged computer professionals to consider the ethical implications of their work while contemplating the implications of automation and the skill of game-playing computers—the consequences, that is, of thoughtlessly assigning to computers decisions that should be retained by humans.
The 1950s were something of a Cambrian era in computer design. The detailed EDVAC design document opened the floodgates for variations and improvements, and for experiments with new storage and switching elements, in both the academy and the for-profit sector. Maurice Wilkes’s 1952 invention of microcode (Wilkes, 1981, here chapter 15) is included as an example of the emergence of an important idea in simple form in response to immediate needs: it was just getting too tiresome to keep rewiring computers as improvements were made to the instruction set. The dramatic increase in the number of logic components resulting from transistorization and then integrated circuits set in motion the crazy exponential growth named after Gordon Moore (1965, here chapter 25) in a magazine article with a cartoon depiction of a home computer well before anyone was imagining what use such a device could have.
Grace Hopper recognized, before it was apparent to others, that hardware would soon become the cheapest part of a computer system, because the computer was a one-time purchase but new software investments could go on forever (Hopper, 1952, here chapter 16). Moreover, she understood that higher-level languages were not merely a convenience but a necessity, since no one with a code base could afford to rewrite it from scratch every time a new computer was acquired.
FORTRAN, ALGOL, and COBOL all emerged during the 1950s, COBOL thanks to Hopper, who recognized the importance of business computing. We cannot here do justice to any of these programming languages or their creators, except to John McCarthy for daring to adapt Alonzo Church’s lambda-calculus, developed to put the Entscheidungsproblem to rest, as a functional programming language (McCarthy, 1960, here chapter 21). In doing so he connected the logical tradition directly to the art of computer programming: if viewing computation as symbol manipulation was key to understanding the limits of the computable, why not take symbol manipulation as primitive in a practical programming language? McCarthy also brought the recursive style of function definition, which had been established in the metamathematics of the early twentieth century and systematized by Rózsa Péter (1951), convincingly into the center of computer programming.
The influence of Alonzo Church cannot be overstated. His PhD students at Princeton included Turing, Michael Rabin, and Dana Scott; McCarthy, who was also a PhD student at Princeton in the Church era, is responsible for the founding of artificial intelligence, for the first time-sharing system (Corbató et al., 1962, here chapter 23), and for symbolic and functional programming.
The engineering of computer systems as assistants to human memory and thought, not just calculation, can be traced to the publication by Vannevar Bush (1945a, here chapter 11), in a popular magazine, of the article “As We May Think.” He tried to imagine how human thought might be assisted by technology in the future, but did not have any clear notion that a computer would be helpful. Over the next decade large, clunky computers began to appear, and visionaries began to imagine ways in which human beings might someday work more gracefully with them. J. C. R. Licklider (1960, here chapter 20) imagined cooperation between humans and computers, and Doug Engelbart (1962, here chapter 22) worked to make it real. Ivan Sutherland (1963, here chapter 24) dramatically launched the field of computer graphics with his PhD thesis at MIT, where Bush had imagined a thought assistant almost twenty years earlier.
The study of formal, mathematical, abstract methods to describe and analyze computations has many threads. There had been better and worse paper and pencil algorithms for centuries, and the descriptions of mechanical programmable calculators by Lovelace and Aiken suggest making certain programming choices to improve performance or accuracy. Graph algorithms were studied as part of operations research during the World War II years, and evolved into a major subfield of computer science. Kruskal (1956, here chapter 17) presented the minimum spanning tree algorithm now learned by virtually every student of computer science, and Edmonds (1965) talked explicitly about algorithmic efficiency while discussing maximum matching.
The field of algorithms and their efficiency grew organically as computers forced precise articulation and programming presented new challenges. On the positive side, Volker Strassen discovered stunningly original algorithms for the old problems of matrix multiplication and inversion (Strassen, 1969, here chapter 30) and Edsger Dijkstra (1965, here chapter 26) provably solved a tricky problem in concurrency control. On the negative side, Stephen Cook (1971b, here chapter 34) adapted Turing’s proof that programs could be described by logical formulas to the concrete case of 𝒩𝒫 and propositional logic, thereby posing the as yet unsolved 𝒫 = 𝒩𝒫 problem; and Richard Karp (1972, here chapter 36) showed that a great variety of known problems exhibiting combinatorial explosion were essentially variations on the same problem. Knuth (1976, here chapter 43) proposed the notation that is now almost universally accepted for comparing the computational complexity of algorithms—and also, in an equally delightful note (Knuth, 1974b), proposed the terminology for 𝒫 and 𝒩𝒫 on which the scientific community has agreed.
As the abstraction of algorithms proceeded, so did the impetus to treat actual programs more formally and abstractly, even if that meant giving up the full expressive power of programming languages. So Dijkstra (1968a, here chapter 29) proposed getting rid of branches and jumps entirely, a radical proposal that is now generally accepted; Hoare (1969, here chapter 31) proposed treating programs like logic formulas, subject to formal verification; and DeMillo et al. (1979, here chapter 44) pushed back forcefully against the verification agenda. Also Liskov and Zilles (1974, here chapter 39) proposed applying the same abstraction standards to data that had been accepted as appropriate for control, initiating a trend that culminated in object-oriented programming.
A line connects the earliest time-sharing operating system, by Corbató et al. (1962, here chapter 23), to the “THE” multiprogramming system of Dijkstra (1968b, here chapter 28) and UNIX (Ritchie and Thompson, 1974, here chapter 37), which now exists in many variants.
Such large software systems became increasingly cumbersome to write and to get working. In the 1960s and 1970s, some multimillion-dollar projects cost millions more to fix shortly after release, or had to be junked entirely. Software engineering emerged as a practitioner’s art, the subject of two classic treatments in particular: Royce (1970, here chapter 33) and Brooks (1995, originally published in 1975, here chapter 40). Every programmer should read both.
Large data sets likewise required new methods. Codd (1970, here chapter 32) defined the relational model, which is conceptually elegant but required software sophistication to become practical. It is now at the core of the data management industry. Retrieval from large text databases is now taken for granted as an aspect of Web search, but it is an old information retrieval problem. Karen Spärck Jones (1972, here chapter 35) discovered useful principles for relevance of terms, balancing frequency in a document (which tends to suggest relevance) against frequency in the entire corpus (which suggests that a word is not distinctive and hence not a useful indexing key).
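The balance described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the formula of the 1972 paper: it uses one common later form of the weighting, term frequency multiplied by the logarithm of (number of documents / number of documents containing the term). The function name `tf_idf` and the three-document toy corpus are hypothetical.

```python
import math
from collections import Counter

def tf_idf(term, document, corpus):
    """Weight of `term` in `document`, given a list of tokenized documents.

    Assumed form (one common variant): tf * log(N / df).
    """
    term_frequency = Counter(document)[term]
    document_frequency = sum(1 for doc in corpus if term in doc)
    if document_frequency == 0:
        return 0.0  # the term never occurs in the corpus
    return term_frequency * math.log(len(corpus) / document_frequency)

# Hypothetical toy corpus, already split into words.
corpus = [
    "the relational model of the data".split(),
    "the protocol of the network".split(),
    "the secret of the key".split(),
]

# "the" occurs in every document, so its weight collapses to zero;
# "relational" occurs in only one, so it receives a positive weight.
common = tf_idf("the", corpus[0], corpus)
distinctive = tf_idf("relational", corpus[0], corpus)
```

A word that is frequent in one document but rare in the corpus scores high. A word that appears everywhere scores zero however often it is repeated.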
Networking exploded in the 1970s. Cerf and Kahn (1974, here chapter 38) lay out the internet protocols, little changed today from their original description, and Metcalfe and Boggs (1976, here chapter 41) describe the Ethernet protocols for local area networks. The acceptance of these two protocols as nonproprietary standards has made it possible for computers everywhere to communicate with each other.
Ubiquitous interconnection required better secrecy. Encryption was an old technology for conveying messages in secret between two parties, but traditional methods were of limited use on the internet because the encryption/decryption key would have to be shared through the same insecure channel as the message. Diffie and Hellman (1976a, here chapter 42) stunned the community by proposing a solution (which, it turned out, had been partially anticipated by Ralph Merkle and by members of GCHQ, the British intelligence service). Rivest et al. (1978, here chapter 45) supplied the crucial enabling mathematics; their algorithm is widely used today, even though its security rests on unproven foundations. The current explosion of interest in quantum computing is in no small part due to the possibility that quantum computers could be used to break the RSA codes (Shor, 1999). Shamir (1979, here chapter 46), the last paper in this volume, is a gem on sharing secrets in a way that requires a certain level of cooperation for recovery; it is a problem that can be stated without reference to computers and uses only high school mathematics to solve, and yet makes sense only in the computer age.
Some ideas in computer science are so familiar that it is hard to remember that they were once new. The idea of logic is one. Calculating methods from the ancient world had limited influence on the design of today’s computers, but the principles of two-valued logic underpin digital computing.
Computers are general-purpose. Most chess-playing computers or inventory-keeping computers are just generic computers that can be used for games or for business. A bit that one day indicates whether the knight on square f3 is white or black might the next day indicate whether this book is in stock. The logical rules embedded in the computer’s hardware can be used for manipulating both kinds of information. These abstract ideas of things, properties, and logical rules for reasoning about them did not always exist. These were Aristotle’s ideas, and they were necessary first steps toward computations about things and their properties.
Aristotle (384–322 BCE) was the great systematizer. In many works, mostly lost, he analyzed and categorized everything imaginable. His Prior Analytics presented the world’s first system of logic. Its purpose is to infer conclusions from premises in a way that depends only on the form of the argument, not on the persuasiveness of the speaker or on anything not mentioned in the premises. Aristotle’s explanation of a logical deduction is the root of all modern logic.
Aristotle’s technical vocabulary makes for difficult reading. Fortunately, the archaic details are not important to us. He works with the idea of a predicate, a property that a thing may or may not have. This gave rise to the modern idea of a thing being a member of a set, the set of all things having that property. Aristotle’s notion of “belonging” can be understood as the subset relation. For example, to say that property A belongs to none of the Bs is to say that no member of the set B is a member of A, that is, that B is disjoint from A. So the example “if A is predicated of every B and B of every C, it is necessary for A to be predicated of every C” is a statement of the transitivity of the superset relation: If A ⊇ B and B ⊇ C, then A ⊇ C. Aristotle had no such notation at his disposal, but those who formalized logic and set theory stood on his shoulders.
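The set reading of predication can be checked mechanically. The following Python sketch is an illustration only, with terms modeled as Python sets; the helper names `predicated_of_every` and `predicated_of_none` and the sample sets are assumptions of this example, not notation from Aristotle or from the modern literature.

```python
def predicated_of_every(a: set, b: set) -> bool:
    """'A is predicated of every B': every member of B is in A (A ⊇ B)."""
    return a >= b

def predicated_of_none(a: set, b: set) -> bool:
    """'A belongs to none of the Bs': A and B share no member."""
    return a.isdisjoint(b)

# Hypothetical sample terms.
living_things = {"oak", "horse", "man"}
animals = {"horse", "man"}
horses = {"horse"}
stones = {"granite"}

# Transitivity of the superset relation: A ⊇ B and B ⊇ C give A ⊇ C.
first_premise = predicated_of_every(living_things, animals)
second_premise = predicated_of_every(animals, horses)
conclusion = predicated_of_every(living_things, horses)

# Disjointness: "animal" belongs to none of the stones.
no_stone_is_animal = predicated_of_none(animals, stones)
```

With these sample sets both premises hold and so does the conclusion. The sketch verifies one instance; the general claim follows from the definition of the superset relation, not from any finite check.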
There had been convincing mathematical arguments before Aristotle, but Aristotle was the first to abstract the form of such arguments from their content. In doing so, he showed how to reason mechanically, by matching propositions to general templates and inferring the conclusions that necessarily followed. Aristotle neither designed nor built logical calculating machines, but his vocabulary hints that he was describing a computational process. The word translated here as “deduction” is συλλογισμός, which means a “reckoning up” or “computation.” “Syllogism,” the English rendition of Aristotle’s term, now means something narrower, the particular forms of deduction that Aristotle explains in this work.
Aristotle also exhibits a method for showing that certain purported forms of inference are not universally valid, by the use of counterexamples. He invites the reader to draw a general inference from the premises A ⊇ B and B ∩ C = ∅. In fact, no necessary conclusion about the relation of A and C can be inferred from these two premises. For if we take A = animals, B = horses, and C = men, then the premises are satisfied (horses are animals, but no horse is a man), and A ⊇ C (men are animals); but if A = animals, B = men, and C = stones, then again the premises are satisfied (men are animals, but no man is a stone), but A is disjoint from C (no stone is an animal). (In a modern argument, we would have to finish by noting that C cannot simultaneously be a subset of and disjoint from A, as long as C is nonempty.) This is the method used to this day to refute conjectures and to prove the independence of hypotheses.
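The counterexample method lends itself to a short mechanical check. The Python sketch below, offered as an illustration under stated assumptions, builds the two interpretations just described and confirms that both satisfy the premises while disagreeing about the relation of A to C. The individuals named in the sets and the helper `premises_hold` are hypothetical.

```python
def premises_hold(a: set, b: set, c: set) -> bool:
    """The two premises: A ⊇ B, and B is disjoint from C."""
    return a >= b and b.isdisjoint(c)

# Hypothetical individuals standing in for the terms.
animals = {"Bucephalus", "Socrates"}
horses = {"Bucephalus"}
men = {"Socrates"}
stones = {"pebble"}

# Each model assigns sets to (A, B, C).
models = [
    (animals, horses, men),   # animal, horse, man
    (animals, men, stones),   # animal, man, stone
]

both_satisfy_premises = all(premises_hold(a, b, c) for a, b, c in models)

# Candidate conclusions evaluated in each model.
a_contains_c = [a >= c for a, _, c in models]
a_disjoint_c = [a.isdisjoint(c) for a, _, c in models]
```

Here `a_contains_c` comes out true in the first model and false in the second, and `a_disjoint_c` the other way around. Since each candidate conclusion fails in some model of the premises, neither follows from them.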
We must first state what our inquiry is about and what its object is, saying that it is about demonstration and that its object is demonstrative science. Next, we must determine what a premise is, what a term is, and what a deduction is, and what sort of deduction is complete and what sort incomplete; and after these things, what it is for something to be or not be in something as a whole, and what we mean by “to be predicated of every” or “predicated of none.”
A premise, then, is a sentence affirming or denying something about something. This sentence may be universal, particular, or indeterminate. I call belonging “to every” or “to none” universal; I call belonging “to some,” “not to some,” or “not to every,” particular, and I call belonging or not belonging (without a universal or particular) indeterminate (as, for example, “the science of contraries is the same” or “pleasure is not a good”).
A demonstrative premise is different from a dialectical one in that a demonstrative premise is the taking of one or the other part of a contradiction (for someone who is demonstrating does not ask for premises but takes them), whereas a dialectical premise is the asking of a contradiction. However, this will make no difference as to whether a deduction comes about for either man, for both the one who demonstrates and the one who asks deduce by taking something either to belong or not to belong with respect to something. Consequently, a deductive premise without qualification will be either the affirmation or the denial of one thing about another, in the way that this has been explained. It will be demonstrative if it is true and has been obtained by means of the initial assumptions; a dialectical premise, on the other hand, is the posing of a contradiction as a question (when one is getting answers) and the taking of something apparent and accepted (when one is deducing), as was explained in the Topics.
What a premise is, then, and how deductive, demonstrative, and dialectical premises differ, will be explained more precisely in what follows; let the distinctions just made be sufficient for our present needs.
I call that a term into which a premise may be broken up, i.e., both that which is predicated and that of which it is predicated (whether or not “is” or “is not” is added or divides them).
A deduction is a discourse in which, certain things having been supposed, something different from the things supposed results of necessity because these things are so. By “because these things are so,” I mean “resulting through them,” and by “resulting through them” I mean “needing no further term from outside in order for the necessity to come about.”
I call a deduction complete if it stands in need of nothing else besides the things taken in order for the necessity to be evident; I call it incomplete if it still needs either one or several additional things which are necessary because of the terms assumed, but yet were not taken by means of premises.
For one thing to be in another as a whole is the same as for one thing to be predicated of every one of another. We use the expression “predicated of every” when none of the subject can be taken of which the other term cannot be said, and we use “predicated of none” likewise.
Now, every premise expresses either belonging, or belonging of necessity, or being possible to belong; and some of these, for each prefix respectively, are affirmative and others negative; and of the affirmative and negative premises, in turn, some are universal, some are in part, and some indeterminate.
It is necessary for a universal privative premise of belonging to convert with respect to its terms. For instance, if no pleasure is a good, neither will any good be a pleasure. And the positive premise necessarily converts, though not universally but in part. For instance, if every pleasure is a good, then some good will be a pleasure. Among the particular premises, the affirmative must convert partially (for if some pleasure is a good, then some good will be a pleasure), but the privative premise need not (for it is not the case that if man does not belong to some animal, then animal will not belong to some man).
First, then, let premise AB be universally privative. Now, if A belongs to none of the Bs, then neither will B belong to any of the As. For if it does belong to some (for instance to C), it will not be true that A belongs to none of the Bs, since C is one of the Bs. And if A belongs to every B, then B will belong to some A. For if it belongs to none, neither will A belong to any B; but it was assumed to belong to every one. And similarly if the premise is particular: if A belongs to some of the Bs, then necessarily B belongs to some of the As. (For if it belongs to none, then neither will A belong to any of the Bs.) But if A does not belong to some B, it is not necessary for B also not to belong to some A (for example if B is animal and A man: for man does not belong to every animal, but animal belongs to every man). …
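[Editorial illustration, not part of Aristotle's text.] The four conversion claims can be tested exhaustively over a small universe. The Python sketch below models terms as nonempty sets; the restriction to nonempty terms is an assumption of this sketch, and the conversion of "belongs to every" into "belongs to some" depends on it. All function names are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(universe):
    """All nonempty subsets of `universe`, as frozensets."""
    items = list(universe)
    return [frozenset(combo)
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]

# The four kinds of premise "A ... B" under the set reading.
def to_every(a, b):     # A belongs to every B
    return b <= a

def to_none(a, b):      # A belongs to none of the Bs
    return a.isdisjoint(b)

def to_some(a, b):      # A belongs to some B
    return not a.isdisjoint(b)

def not_to_some(a, b):  # A does not belong to some B
    return not b <= a

def converts(premise, conclusion, terms):
    """True if conclusion(B, A) holds whenever premise(A, B) holds."""
    return all(conclusion(b, a)
               for a in terms for b in terms
               if premise(a, b))

terms = nonempty_subsets({1, 2, 3})

universal_privative = converts(to_none, to_none, terms)
universal_affirmative = converts(to_every, to_some, terms)
particular_affirmative = converts(to_some, to_some, terms)
particular_privative = converts(not_to_some, not_to_some, terms)
```

The first three checks succeed and the fourth fails, matching the text: for instance, with A = {1} and B = {1, 2}, A does not belong to some B, yet B belongs to every A. A search over a three-element universe only illustrates the valid conversions; it refutes the invalid one outright.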
Having made these determinations, let us now say through what premises, when, and how every deduction comes about. (We will need to discuss demonstration later. Deduction should be discussed before demonstration because deduction is more universal: a demonstration is a kind of deduction, but not every deduction is a demonstration.)
Whenever, then, three terms are so related to each other that the last is in the middle as a whole and the middle is either in or not in the first as a whole, it is necessary for there to be a complete deduction of the extremes. (I call that the middle which both is itself in another and has another in it—this is also middle in position—and call both that which is itself in another and that which has another in it extremes.) For if A is predicated of every B and B of every C, it is necessary for A to be predicated of every C (for it was stated earlier what we mean by “of every”). Similarly, if A is predicated of no B and B of every C, it is necessary that A will belong to no C. However, if the first extreme follows all the middle and the middle belongs to none of the last, there will not be a deduction of the extremes, for nothing necessary results in virtue of these things being so. For it is possible for the first extreme to belong to all as well as to none of the last. Consequently, neither a particular nor a universal conclusion becomes necessary; and, since nothing is necessary because of these, there will not be a deduction. Terms for belonging to every are animal, man, horse; for belonging to none, animal, man, stone. Nor when neither the first belongs to any of the middle nor the middle to any of the last: there will not be a deduction in this way either. Terms for belonging are science, line, medicine; for not belonging, science, line, unit.
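The first-figure patterns described above can be checked mechanically on a small model. The following is a minimal sketch, assuming each term is read extensionally as a nonempty set of individuals, so that "A belongs to every B" means B is a subset of A and "A belongs to no B" means the two sets are disjoint. This set reading and the helper names (`every`, `none`, `some`, `not_every`, `valid`) are assumptions made for illustration and do not come from the text. An exhaustive search over a three-element universe is a sanity check, not a proof.

```python
from itertools import combinations, product


def nonempty_subsets(universe):
    """All nonempty subsets of a small universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for size in range(1, len(items) + 1)
            for c in combinations(items, size)]


# Extensional readings of the four kinds of premise (an assumption of this sketch).
def every(a, b):      # "a belongs to every b"
    return b <= a

def none(a, b):       # "a belongs to no b"
    return not (a & b)

def some(a, b):       # "a belongs to some b"
    return bool(a & b)

def not_every(a, b):  # "a does not belong to some b"
    return not (b <= a)


TERMS = nonempty_subsets({0, 1, 2})


def valid(major, minor, conclusion):
    """True if conclusion(A, C) holds in every model where major(A, B) and minor(B, C) hold."""
    return all(conclusion(a, c)
               for a, b, c in product(TERMS, repeat=3)
               if major(a, b) and minor(b, c))


print(valid(every, every, every))  # A of every B, B of every C, so A of every C
print(valid(none, every, none))    # A of no B, B of every C, so A of no C
# A of every B, B of no C: test each of the four possible conclusions.
print(any(valid(every, none, c) for c in (every, none, some, not_every)))
```

Under these assumptions the search confirms the two universal deductions and finds a counterexample to every candidate conclusion in the rejected arrangement, which is the role the triples animal, man, horse and animal, man, stone play in the text.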
因此,如果这些术语是通用的,那么这个数字什么时候需要扣除什么时候不需要扣除就很清楚了;而且很明显,如果有演绎,那么这些术语必然像我们所说的那样相关,如果它们以这种方式相关,那么就会有演绎。
Thus, it is clear when there will and when there will not be a deduction in this figure if the terms are universal; and it is also clear both that if there is a deduction, then the terms must necessarily be related as we have said, and that if they are related in this way, then there will be a deduction.
如果其中一个术语相对于其余术语而言是全称的,而另一个术语是特称的,则当将普遍性与主要极端(无论是积极的还是私有的)相关联而将特殊性与次要极端相关时(为正),那么必然会完全扣除;然而,当普遍性与次要极端相关时,或者当这些术语以任何其他方式相关时,这是不可能的。(我将中间所在的极端称为“主要”,将中间以下的极端称为“次要”。)让A属于每个B,而B属于某个C。那么,如果要谓述 every 就是开头所说的,那么A必然属于某个C。如果A不属于任何B而B属于某个C ,那么A必然不属于某个C。(因为它也被定义为我们所说的“不谓词”,这样就会有一个完整的演绎。)同样,如果BC应该是不确定的,只要它是正的(因为无论是不确定的前提,这都将是相同的演绎)或采取特定的一个)。……
If one of the terms is universal and the other is particular in relation to the remaining term, then when the universal is put in relation to the major extreme (whether this is positive or privative) and the particular is put in relation to the minor extreme (which is positive), then there will necessarily be a complete deduction; when, however, the universal is put in relation to the minor extreme, or when the terms are related in any other way, this is impossible. (I call that extreme the “major” which the middle is in and that extreme the “minor” which is under the middle.) For let A belong to every B and B to some C. Then, if to be predicated of every is what was said in the beginning, it is necessary for A to belong to some C. And if A belongs to no B and B to some C, then it is necessary for A not to belong to some C. (For it has also been defined what we mean by “predicated of no” so that there will be a complete deduction.) Similarly also if BC should be indeterminate, provided it is positive (for it will be the same deduction whether an indeterminate premise or a particular one is taken). …
经 Hackett Publishing Company, Inc. 许可,转载自亚里士多德 (1989)。
Reprinted from Aristotle (1989), with permission from Hackett Publishing Company, Inc.
戈特弗里德·威廉·莱布尼茨(Gottfried Wilhelm Leibniz,1646-1716 年)是一位博学者、一位博学的哲学家、一位法律和政治思想家,也是一位深刻而多产的数学家。他将角逐第一位计算机科学家的头衔。帕斯卡可能被考虑用于他的加法机,以及之前和之后的其他机器,但莱布尼茨构建了一个既可以乘法又可以除法的嵌套循环计算器(参见第 62 页)。更值得注意的是,他发明了二进制算术(图2.1)并设计了一个二进制计算器(但从未被制造出来)。
Gottfried Wilhelm Leibniz (1646–1716) was a polymath—a philosopher of great breadth and a legal and political thinker, as well as a profound and prolific mathematician. He would be in the running for the title of first computer scientist. Pascal might be considered for his adding machine, and others before and after, but Leibniz built a nested-loop calculator that could both multiply and divide (see page 62). More remarkably, he invented binary arithmetic (Figure 2.1) and designed a binary calculator (which was never built).
图 2.1:二进制到十进制的转换和二进制和,来自莱布尼茨的 De progressione dyadica (1679)。
Figure 2.1: Binary to decimal conversion and a binary sum, from Leibniz’s De progressione dyadica (1679).
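The two operations named in the caption can be sketched in a few lines. This is a minimal illustration and not a transcription of Leibniz's manuscript: the function names `binary_to_decimal` and `binary_add` and the sample values are assumptions chosen for the example.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of '0'/'1' digits to its integer value."""
    value = 0
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a binary digit: {bit!r}")
        # Doubling shifts the digits read so far one place to the left.
        value = value * 2 + int(bit)
    return value


def binary_add(a: str, b: str) -> str:
    """Add two binary strings column by column, carrying as in decimal addition."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []
    for x, y in zip(reversed(a), reversed(b)):  # start from the units column
        total = int(x) + int(y) + carry
        digits.append(str(total % 2))
        carry = total // 2
    if carry:
        digits.append("1")
    return "".join(reversed(digits))


print(binary_to_decimal("1101"))  # 8 + 4 + 0 + 1
print(binary_add("110", "111"))   # 6 + 7, written in binary
```

The only carrying rule needed is that two ones in a column leave a zero and carry a one, which is part of what makes the notation attractive for a machine.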
莱布尼茨与艾萨克·牛顿共同发现了当时所谓的无穷小微积分。今天,微积分与数学如此相似,以至于我们不再听到作为其最初动机的计算的参考,例如,如何通过将薄片的面积相加来找到图形的面积。莱布尼茨关于x的无穷小变化的符号dx至今仍然存在,因为该符号可以很容易地表述用牛顿点符号几乎无法表达的即插即用规则。例如,使用牛顿符号表示y的导数来表述恒等式是非常尴尬的。
Leibniz shares credit with Isaac Newton for discovering what was then called the calculus of infinitesimals. Today the calculus is so identified with mathematics that we no longer hear the reference to the calculations that were its original motivations, for example, how to find the area of a figure by adding up the areas of thin slices. Leibniz’s notation dx for an infinitesimal change in x survives to this day, because the notation makes it easy to state plug-and-chug rules that are almost inexpressible in Newton’s dot notation. For example, the identity $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$ is very awkward to state using Newton’s notation $\dot{y}$ for the derivative of y.
莱布尼茨认识到良好的符号如何有助于清晰的思维。他在 14 岁时被介绍认识了亚里士多德,作为他教育的一部分,他写了一篇关于在法律推理中使用系统化逻辑的论文(Leibniz,1666)。他开发了逻辑推理的形式符号,这是数理逻辑的一种早期形式(Struik,1969,第 123 页)。他在具有正式规则的逻辑系统中捕捉推理的努力演变成了一个宏伟的计划,将所有人类知识统一到一个能够解决所有争端的系统中;事实一旦确定,就会得出无可争议的答案。所有人类推理都将被简化为堵塞和突突,其结果将是一个明确更美好的世界。莱布尼茨著名的乐观主义——他相信我们生活的世界是所有可能的世界中最好的——就这样与早期的技术乌托邦主义融为一体。
Leibniz recognized how good notation can contribute to clear thought. He had been introduced to Aristotle at age 14, and as part of his education wrote a thesis on the use of a systematized logic in legal reasoning (Leibniz, 1666). He developed formal notations for logical reasoning, an early form of mathematical logic (Struik, 1969, page 123). His effort to capture reasoning in a logical system with formal rules evolved into a grand plan to unify all of human knowledge in a system that would settle all disputes; the facts, once established, would then yield incontestable answers. All of human reasoning would be reduced to plugging and chugging, and the result would be an unequivocally better world. Leibniz’s famous optimism—his confidence that the world we live in was the best of all possible worlds—thus melded with an early techno-utopianism.
尽管他因数学贡献而受到认可,但他的乐观却遭到嘲笑。1759年,伏尔泰在《老实人》中将他讽刺为潘格洛斯博士。今天他的梦想是通过逻辑和机械化推理实现的完美世界似乎很幼稚,而且不太适合他所延伸的神学框架。然而,他想象中的逻辑还原论经常再次出现,尤其是在 McCulloch 和 Pitts(1943 年,此处第 83 页)和 Bush(1945a,此处第 114 页)中,并且通常在每个自动化决策支持系统中。
Even as he was recognized for his mathematical contributions, he was ridiculed for his optimism. In 1759 Voltaire caricatured him as Dr. Pangloss in Candide. Today his dream of a perfect world through logic and mechanized reasoning seems naïve and a poor fit for the theological framework on which he stretched it. And yet his imagined logical reductionism reappears regularly—recognizably in McCulloch and Pitts (1943, here page 83) and Bush (1945a, here page 114), and generally in every automated decision support system.
既然幸福在于满足,而持久的满足取决于我们对未来的保证——这种保证基于我们对上帝和灵魂的本质应该拥有的知识——因此,知识对于真正的幸福是必要的。
SINCE happiness consists in contentment, and since enduring contentment depends on the assurance we have of the future—assurance based on the knowledge we should have of the nature of God and the soul—it follows that knowledge is necessary for true happiness.
但知识依赖于论证,而论证是通过某种方法发明的,这并不是所有人都知道的。因为,尽管每个人都有能力判断论证(因为如果所有认真思考论证的人都没有被它说服和说服,那么它就不配得上这个名字了),但并不是每个人都有能力主动设计论证,也不是每个人都能够自行设计论证。一旦发现它们,就明确地提出它们,因为缺乏闲暇或方法。
But knowledge depends upon demonstration, and the invention of demonstrations by a certain method, which is not known to everyone. For although every man is capable of judging a demonstration (since it would not deserve this name if all those who consider it attentively were not convinced and persuaded by it), nevertheless not every man is capable of devising demonstrations on his own initiative, nor to propose them clearly once they are found, for want of leisure or method.
在我看来,真正的方法在其所有范围内迄今为止都是未知的,除了数学之外还没有被实践过。就数学本身而言,它仍然非常不完美,正如我有幸通过令人惊讶的证明向一些人(他们今天被认为是本世纪最重要的数学家之一)展示的那样。我希望提供一些例子,这些例子也许不值得后人学习。
The true method taken in all of its extent is to my mind a thing hitherto quite unknown, and has not been practised except in mathematics. It is still very imperfect with regard to mathematics itself, as I had the good fortune to show to some (who are considered today to be among the foremost mathematicians of the century) by means of surprising proofs. And I expect to offer some examples of it, which will be perhaps not unworthy of posterity.
然而,如果数学家的方法不足以发现他们所希望的一切,那么它至少能够使他们免于错误,如果他们没有说出他们应该说的一切,那么他们也没有说出他们应该说的一切。不应该说。
Yet if the method of mathematicians has not been sufficient to discover all that could be wished from them, it has at least been able to save them from mistakes, and if they have not said everything they ought to say, they have also said nothing they ought not to say.
如果那些培养其他科学的人至少在这一点上模仿数学家,我们会非常高兴,我们早就拥有了可靠的形而上学,以及依赖于它的道德,因为形而上学包含了关于上帝和灵魂,知识应该统治我们的生活。
If those who have cultivated the other sciences had imitated the mathematicians at least on this point, we would be very happy, and we would have long since had a secure metaphysics, as well as the morals which depend upon it, since metaphysics contains knowledge of God and the soul, knowledge which should govern our life.
此外,我们还将拥有运动科学,这是物理学乃至医学的关键。确实,我相信我们现在正处于一种渴望实现这一目标的状态,而我最初的一些想法,由于它们的简单性,受到了我们这个时代最有学问的人的热烈欢迎,我相信我们现在只需要进行适当设计和考虑的某些实验(而不是像通常发生的那样通过偶然和反复试验),以便在此基础上建立某种示范性物理学的堡垒。
Moreover, we would have the science of motion, which is the key to physics and, consequently, to medicine. It is true I believe we are now in a state to aspire to it, and some of my first thoughts, because of their wonderful simplicity, have been received with such applause by the most learned of our time that I believe we now have only to perform certain experiments properly designed and considered (rather than by chance and by trial and error, as commonly happens) in order to erect thereupon the bastion of a certain and demonstrative physics.
现在,到目前为止,演示艺术只在数学中被发现的原因尚未被任何人正确理解,因为如果知道困难的原因,那么补救措施早就被发现了。原因是数学有它自己的测试。因为当我遇到一个错误的定理时,我不需要检验它,甚至不需要知道证明,因为我将通过一个简单的实验来发现它的错误性,这个实验只需要墨水和纸张,即通过计算,这将揭示错误,无论它有多小。如果在其他事情上通过实验验证推理也同样容易,那么就不会存在会有如此不同的意见。但麻烦在于,物理学中的实验很困难且成本很高,而在形而上学中,除非上帝为了我们而创造奇迹,让我们了解遥远的非物质事物,否则它们是不可能的。
Now, the reason the art of demonstration has been until now found only in mathematics has not been properly fathomed by anyone, for if the cause of the difficulty had been known, the remedy would have long since been discovered. The reason is that mathematics carries its own test with it. For when I am presented with a false theorem, I do not need to examine it or even to know the demonstration, since I shall discover its falsity a posteriori by an easy experiment, which costs nothing but ink and paper, that is, by calculation, which will reveal the error, no matter how small it is. If it were as easy in other matters to verify reasoning by experiments then there would not be such differing opinions. But the trouble is that experiments in physics are difficult and have a high cost, and in metaphysics they are impossible unless God, for our sake, performs a miracle to make remote immaterial things known to us.
这个困难并非不可克服,尽管乍一看似乎是这样。但那些想要考虑我要说的内容的人很快就会改变主意。那么,必须指出的是,在数学中进行的测试或实验是为了防止错误推理(例如,抛出九的测试、科隆的鲁道夫关于圆的大小的计算、正弦表)或其他)不是在事物本身上制作的,而是在我们取代该事物的字符上制作的。[编辑:“剔除九”是通过重复将一个数字的数字相加来测试该数字是否能被 9 整除。亚里士多德通过计算正 96 边形的面积来近似π的值;Ludolph (1540–1610) 使用 2 62边形来代替,从而将π计算到小数点后 35 位,这是一项花费数年时间的工作(Ludolph van Ceulen,1596)。] 用于计算数字,例如如果 1677 次365 等于 612,105,如果必须堆 365 堆,每堆放 1677 块小石头,然后最后数一数,才能知道是否找到上述数字,那是不可能的。这就是为什么我们满足于通过九分测试或其他方式用纸上的字符来做到这一点。同样,当有人提出所谓的圆的精确正交时,我们不需要制作一个材料圆并在其周围系一根线来查看该线的长度或周长与直径是否与所提出的比例相同;这将是很困难的,因为即使误差是直径的千分之一(或更小),也需要以很高的精度构造一个大圆。然而,我们仍然通过实验和数字计算或测试来反驳这种错误的正交。但这种测试仅在纸上进行,因此是在代表事物的字符上进行,而不是在事物本身上进行。
This difficulty is not insurmountable, although at first it seems to us that it is. But those who will want to consider what I am going to say about it will soon change their mind. It must be noted, then, that the tests or experiments performed in mathematics to guard against false reasoning (as are, for example, the test of casting out nines, the calculation of Ludolph of Cologne concerning the size of the circle, tables of sines or others) are not made on the thing itself, but on the characters we have substituted in place of the thing. [EDITOR: “Casting out nines” is to test whether a number is divisible by 9 by repeatedly adding up its digits. Archimedes had approximated the value of π by calculating the area of a regular 96-sided polygon; Ludolph (1540–1610) used a $2^{62}$-sided polygon instead, thus calculating π to 35 decimal places, an effort that took years (Ludolph van Ceulen, 1596).] For to take a calculation of numbers, for example if 1677 times 365 makes 612,105, it would never have been done if one had to make 365 heaps and put 1677 small stones in each one and then at length count them all in order to know if the aforementioned number is found. That is why we are content to do it with characters on paper, by means of the test of nines or some other. Likewise, when someone proposes a supposedly exact quadrature of the circle, we do not need to make a material circle and tie a thread around it in order to see whether the length of this thread or the circumference to the diameter has the proportion proposed; that would be difficult, for even if the error is a thousandth (or less) part of the diameter, a large circle constructed with a great deal of accuracy would be required. Yet we nonetheless refute this false quadrature by experiment and by the calculation or test in numbers. But this test is performed only on paper, and consequently on the characters which represent the thing, and not on the thing itself.
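The test of nines applied to the product in the passage can be sketched as follows. This is a minimal illustration under the usual modern reading of the test, namely that a number and its repeated digit sum leave the same remainder on division by 9; the function names `digit_root_mod9` and `passes_nines_test` are hypothetical.

```python
def digit_root_mod9(n: int) -> int:
    """Sum the decimal digits repeatedly until one digit remains, counting 9 as 0."""
    n = abs(n)
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n % 9


def passes_nines_test(a: int, b: int, claimed_product: int) -> bool:
    """Compare the residues of a*b and of the claimed product, using digits only."""
    expected = (digit_root_mod9(a) * digit_root_mod9(b)) % 9
    return expected == digit_root_mod9(claimed_product)


print(passes_nines_test(1677, 365, 612105))  # the product given in the text
print(passes_nines_test(1677, 365, 612115))  # one digit altered
print(passes_nines_test(1677, 365, 612150))  # two digits transposed
```

A failed test proves the claimed product wrong, but a passed test does not prove it right: transposing two digits leaves the digit sum unchanged, so the third call passes even though 612,150 is not the product.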
这种考虑在这个问题上是根本性的,尽管许多非常有能力的人,特别是在我们这个世纪,声称给我们提供了有关物理学、形而上学、道德,甚至政治学、法学和医学的论证,但他们要么是错误的(因为所有的台阶都很滑,除非有某些方向的引导,否则很难不摔倒),或者即使他们真的碰到了,他们也无法使他们的论点被普遍接受(因为还没有办法检验)通过一些每个人都能进行的简单测试来论证)。
This consideration is fundamental in this matter, and although many very able people, especially in our century, have claimed to give us demonstrations regarding physics, metaphysics, morals, and even in politics, jurisprudence, and medicine, nevertheless either they have been mistaken (because all the steps are slippery and it is difficult not to fall unless guided by some directions), or even if they did hit upon them, they have been unable to make their arguments accepted universally (because there has not yet been a way to examine arguments by some easy tests of which everyone is capable).
由此可见,如果我们能找到适合表达我们所有思想的文字或符号,就像算术表达数字或几何分析表达线条一样清晰准确,我们就可以完成所有事情,只要它们是经得起推理的。这可以通过算术和几何来完成。
From this it is clear that, if we could find characters or signs appropriate for expressing all our thoughts as clearly and exactly as arithmetic expresses numbers or geometric analysis expresses lines, we could accomplish in all matters, insofar as they are amenable to reasoning, everything that can be done in arithmetic and geometry.
因为所有依赖于推理的询问都将通过这些字符的换位和某种计算来进行,这将使美丽事物的发明变得相当容易。因为我们不必像今天被迫那样绞尽脑汁,而且尽管如此,我们确信能够根据既定事实完成一切可行的事情。
For all inquiries that depend upon reasoning would be performed by the transposition of these characters and by a kind of calculation, which would make the invention of beautiful things quite easy. For we would not have to rack our brains as much as we are forced to do today, and nevertheless we would be sure of being able to accomplish everything feasible, in accordance with the given facts.
此外,每个人都会对所发现或得出的结论达成一致,因为通过重复计算或尝试一些类似于算术中剔除九的测试来验证计算很容易。如果有人怀疑我所做的事情,我会对他说:“先生,让我们计算一下。”然后拿起笔和墨水,我们应该很快就能解决问题。
Moreover, everyone would be made to agree on what had been found or concluded, since it would be easy to verify the calculation either by repeating it or by trying some tests similar to that of casting out nines in arithmetic. And if someone were to doubt what I had done, I would say to him, “Let us calculate, Sir,” and thus taking up pen and ink we should soon settle the matter.
我总是补充:只要可以根据给定的事实进行推理即可。因为虽然总是需要某些实验来作为推理的基础,但是,一旦给出这些实验,我们就会从中汲取其他人可能从中汲取的一切,甚至会发现还有待进行的实验。澄清所有剩余的疑问。即使在政治和医学领域,这对于以稳定和完美的方式推理给定的症状和情况来说也是一个令人钦佩的帮助。因为即使没有足够的给定情况来形成无误的判断,我们始终能够根据给定的事实确定最有可能的情况。这就是理性所能做的一切。
I always add: insofar as can be done by reasoning, in accordance with the given facts. For although certain experiments are always needed to serve as a basis for reasoning, nevertheless, once these experiments are given, we would draw from them everything that anyone else could possibly draw from them, and would even discover the experiments which remain to be performed for the clarification of all remaining doubts. That would be an admirable help, even in politics and medicine, for reasoning about the given symptoms and circumstances in a steady and perfect way. For even though there will not be enough given circumstances to form an infallible judgement, we shall always be able to determine what is most probable from the given facts. And that is all reason can do.
现在,表达我们所有思想的文字将形成一种可以书写和说出的新语言;这种语言很难构建,但很容易学习。由于其巨大的用途和令人惊讶的便利,它将很快被每个人所接受,并且它将为许多民族之间的交流提供极好的服务,这将有助于其被接受。那些用这种语言写作的人不会犯错误,只要他们避免计算错误、野蛮行为、语法错误以及其他语法和结构错误。此外,这种语言将拥有一个奇妙的特性,即让无知的人闭嘴。因为除了他所理解的内容之外,一个人将无法用这种语言说或写,或者如果一个人试图这样做,就会发生以下两种情况之一:要么先进的东西的虚荣性对每个人来说都是显而易见的,要么它将被通过写作或口语学习。确实,那些通过写作来计算和说话的人有时会遇到他们想象不到的成功,因为舌头跑在了头脑的前面。由于其精确性,这种情况尤其会发生在我们的语言中。如此一来,就不会有任何含糊其辞或模棱两可的情况,其中所有可以理解的内容都将被恰当地表达出来。
Now the characters that will express all our thoughts will form a new language which can be written and spoken; this language will be very difficult to construct but very easy to learn. It will be quickly accepted by everyone on account of its great use and its surprising facility, and it will serve wonderfully for communication among many peoples, which will help make it accepted. Those who will write in this language will not make mistakes, provided they avoid the errors of calculation, barbarisms, solecisms, and other mistakes of grammar and construction. Moreover, this language will possess a wonderful property, namely that of silencing the ignorant. For one will be unable to speak or write in this language except about what he understands, or if one tries to do so, one of two things will happen: either the vanity of what is advanced will be obvious to everyone, or it will be learned by writing or speaking. Just as indeed those who calculate learn by writing and those who speak sometimes encounter success they did not imagine, with the tongue running ahead of the mind. This will happen especially in our language because of its exactness. So much so that there will be no equivocations or amphibolies, and everything that will be said intelligibly in it will be said with propriety.
我敢说,这是人类心灵的最高努力,当这个项目完成时,人们只会感到高兴,因为他们将拥有一种可以提升理性的仪器,就像望远镜可以完善理性一样。想象。
I dare say that this is the highest effort of the human mind, and when the project is accomplished it will merely be up to men to be happy since they will have an instrument which will serve to exalt reason no less than the telescope serves to perfect vision.
如果上帝给我时间的话,完成这个项目是我的愿望之一。我只把它归功于我自己,我在十八岁的时候第一次想到了它,正如我稍后在印刷版话语中所证明的那样(Leibniz,1666)。我确信没有任何一项发明可以与这项发明相媲美,因此我相信没有任何一项发明能够如此让发明家的名字永垂不朽。但我有更强烈的理由来思考这个问题,因为我所信仰的宗教向我保证,上帝的爱在于追求公共利益的热切愿望,而理性告诉我,没有什么比这对公共利益的贡献更大了。最完美的理性。
It is one of my ambitions to finish this project if God grants me the time. I owe it only to myself, and I had the first thought about it at the age of eighteen, as I evidenced a little later in a printed discourse (Leibniz, 1666). And as I am certain there is no invention which comes close to this one, I believe there is nothing so capable of immortalizing the name of the inventor. But I have much stronger reasons for thinking about it, for the religion I follow closely assures me that the love of God consists in an ardent desire to procure the general good, and reason teaches me that there is nothing which contributes more to the general good of all men than what perfects reason.
经劳埃德·斯特里克兰 (Lloyd Strickland) 许可,转载自莱布尼茨 (2020)。图片经戈特弗里德·威廉·莱布尼茨图书馆许可复制。
Reprinted from Leibniz (2020), with permission from Lloyd Strickland. Images reproduced with permission from Gottfried Wilhelm Leibniz Bibliothek.
查尔斯·巴贝奇(Charles Babbage,1791-1871)的分析机是第一个可以合理地称为计算机而不是计算器的设备,因此巴贝奇的助手艾达·洛夫莱斯(Ada Lovelace)是第一批计算机程序员之一。本章记录了许多其他第一,包括第一个被宣传为几乎可以运行但从未正常运行的计算机系统。巴贝奇是英国著名学者、剑桥大学卢卡斯数学教授——艾萨克·牛顿和斯蒂芬·霍金等人都曾担任过这一职位。他的“差分引擎”是一个齿轮和轮子系统,可用于计算多项式的值,从而根据累积差异的原理计算各种数学函数的近似值。最简单的例子来自恒等式,因此完美平方序列可以通过重复添加连续的奇数(即从 1 开始且相差 2 的数字)来生成。在 1830 年代中期,巴贝奇构思了分析机,这种设备的功能远超巴贝奇的想象。
The Analytical Engine of Charles Babbage (1791–1871) was the first device that could reasonably be called a computer rather than a calculator, so Babbage’s assistant Ada Lovelace was among the first computer programmers. This chapter documents many other firsts, including the first computer system pitched as almost operational that never worked properly. Babbage was an eminent British academic, Lucasian Professor of Mathematics at the University of Cambridge, a post held by Isaac Newton and Stephen Hawking, among others. His “Difference Engine” was a system of gears and wheels that could be used to calculate the values of polynomials, and hence approximations to a variety of mathematical functions, based on the principle of accumulating differences. The simplest example arises from the identity $(n+1)^2 - n^2 = 2n + 1$, so that the sequence of perfect squares can be produced by repeatedly adding successive odd numbers, that is, numbers that start at 1 and differ by 2. In the mid-1830s Babbage conceived of the Analytical Engine, a device that was capable of much more: more, indeed, than Babbage imagined.
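The principle of accumulating differences can be sketched with addition alone. Because $(n+1)^2 - n^2 = 2n + 1$, each square is the previous square plus the next odd number; more generally, a polynomial of degree d has a constant d-th difference, so its values can be tabulated from a handful of starting differences. The code below is a minimal sketch of that idea, not a model of Babbage's mechanism, and the names `squares_by_differences` and `tabulate` are hypothetical.

```python
def squares_by_differences(count: int) -> list[int]:
    """Produce 1, 4, 9, ... by repeatedly adding successive odd numbers."""
    squares = []
    value = 0  # 0 squared
    odd = 1    # first difference: 1 squared minus 0 squared
    for _ in range(count):
        value += odd
        squares.append(value)
        odd += 2  # the second difference of n squared is the constant 2
    return squares


def tabulate(initial_differences: list[int], count: int) -> list[int]:
    """Tabulate f(0), f(1), ... from [f(0), first difference, second difference, ...]."""
    registers = list(initial_differences)
    values = []
    for _ in range(count):
        values.append(registers[0])
        # Each register absorbs the one after it; the last stays constant.
        for i in range(len(registers) - 1):
            registers[i] += registers[i + 1]
    return values


print(squares_by_differences(5))
print(tabulate([0, 1, 6, 6], 5))  # starting differences of n cubed at n = 0
```

No multiplication appears anywhere in the loop, which is why a train of adding wheels is enough to tabulate a polynomial.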
阿达·奥古斯塔(Ada Augusta,1815-1852 年)是浪漫诗人拜伦勋爵的女儿,拜伦勋爵出生后不久就抛弃了家人,逃往希腊,八年后在那里去世。艾达的母亲心怀怨恨,决心让艾达自己不会成为一个愚蠢的浪漫主义者,为她提供了女性所能获得的最好的数学教育。艾达向巴贝奇当学徒,在熟悉了他的计算机器后,开始想象分析机的可能性。该设备将按照提花织机的模型构建,提花织机使用穿孔卡来控制其编织的图案。巴贝奇和阿达·奥古斯塔意识到控制机制是如此通用,以至于分析机可以进行几乎无限复杂的计算。
Ada Augusta (1815–1852) was the daughter of the romantic poet Lord Byron, who abandoned the family shortly after her birth and decamped to Greece, where he died eight years later. Ada’s embittered mother, determined that Ada would not herself become a silly romantic, provided her the best mathematical education available to women. Ada apprenticed herself to Babbage, and after becoming familiar with his calculating machines, began to imagine the Analytical Engine’s possibilities. The device was to be constructed on the model of a Jacquard loom, which used punched cards to control the pattern it would weave. Babbage and Ada Augusta realized that the control mechanism was so general that the Analytical Engine could effect calculations of almost unlimited sophistication.
可以——如果它真的可以建造的话。遗憾的是,当时的加工精度不足以构建本文档中描述的规模的可用分析引擎。分析引擎从未按计划运行。(当巴贝奇的注意力转向分析机时,全尺寸差分机的建造被暂停。使用在失败的分析机工作期间开发的加工技术最终完成了一个版本。)
Could—if it could be built at all. Alas, the precision of machining available in the day was inadequate to construct a workable Analytical Engine at the scale described in this document. The Analytical Engine was never operational as planned. (Construction of the full-scale Difference Engine was suspended when Babbage’s attention turned to the Analytical Engine. A version was finally completed using machining techniques developed during the work on the failed Analytical Engine.)
1840 年,巴贝奇在意大利就他的分析机进行了演讲。路易吉·费德里科·梅纳布雷亚 (Luigi Federico Menabrea,1809-1896 年) 是一位意大利工程师,他听了演讲并将演讲内容写了下来,而艾达 (Ada,当时已与洛夫莱斯伯爵结婚,因此被命名为艾达·奥古斯塔·洛夫莱斯)翻译并注释了 Menabrea 的文章。它提到的现在熟悉的编程概念的数量相当惊人: 编译(第14 页上的图 3.1实际上是代数表达式的汇编语言编译);计步(第 21 页);由整数索引控制的循环(第 24 页);条件分支和嵌套循环(第 23 页);考虑代码大小(第 25 页);异常处理(第 14 页);线性系统的机械解决方案(第 24 页);数据和代码之间的二分法(第 12 页);非数值计算(第 17 和 23 页);低估编程的难度(第 15 页);以及未被发现的计算本体的重复暗示。
Babbage lectured in Italy on his Analytical Engine in 1840. Luigi Federico Menabrea (1809–1896) was an Italian engineer who heard the lectures and wrote them up, and Ada (by then married to the Count of Lovelace, and therefore named Ada Augusta Lovelace) translated and annotated Menabrea’s write-up. The number of now-familiar programming concepts it mentions is quite staggering: compilation (Figure 3.1 on page 14 is effectively an assembly-language compilation of an algebraic expression); step counting (page 21); loops controlled by an integer index (page 24); conditional branches and nested loops (page 23); accounting for code size (page 25); exception handling (page 14); mechanical solution of linear systems (page 24); the dichotomy between data and code (page 12); nonnumeric computing (pages 17 and 23); the underestimation of the difficulties of programming (page 15); and the repeated hints of an undiscovered ontology of computations.
艾达·洛夫莱斯 (Ada Lovelace) 因子宫癌去世,享年 36 岁,并接受放血治疗。尽管巴贝奇后来设想了可以在棋盘游戏和其他活动自动化中获胜的机器,但他的发明在数字革命之前基本上被遗忘了。霍华德·艾肯了解巴贝奇的工作,并认为自己是巴贝奇的继承人,但他的欣赏可能是在他承担了自动计算器的设计之后才产生的。巴贝奇和洛夫莱斯走过了这个领域壮观的荒野,但他们技术的缺陷和时间的流逝在很大程度上抹去了他们开辟的道路,直到其他人开始再次追随。
Ada Lovelace died at age 36 of uterine cancer and the blood-letting done to treat it. Though Babbage later imagined machines that could win at board games and the automation of other activities, his inventions were largely forgotten until the digital revolution. Howard Aiken knew of Babbage’s work and considered himself Babbage’s heir, but his appreciation probably came only after he had undertaken the design of his automatic calculator. Babbage and Lovelace trod the spectacular wilderness of the field, but the shortcomings of their technology and the passage of time largely obliterated the path they blazed until others began to follow it again.
帕斯卡这台备受推崇的机器现在只是一个令人好奇的对象,虽然它显示了其发明者的强大智慧,但它本身却没有什么用处。它的权力仅限于执行前四个算术运算,实际上仅限于前两个算术运算,因为乘法和除法是一系列加法和减法的结果。迄今为止,大多数此类机器的主要缺点是,它们需要人类代理人的持续干预来调节它们的运动,从而产生错误的根源;因此,如果它们的使用尚未普遍用于大型数值计算,那是因为它们实际上没有解决问题所提出的双重问题,即结果的正确性与时间的经济性。
THE much-admired machine of Pascal is now simply an object of curiosity, which, whilst it displays the powerful intellect of its inventor, is yet of little utility in itself. Its powers extended no further than the execution of the first four operations of arithmetic, and indeed were in reality confined to that of the first two, since multiplication and division were the result of a series of additions and subtractions. The chief drawback hitherto on most of such machines is, that they require the continual intervention of a human agent to regulate their movements, and thence arises a source of errors; so that, if their use has not become general for large numerical calculations, it is because they have not in fact resolved the double problem which the question presents, that of correctness in the results, united with economy of time.
出于类似的思考,巴贝奇先生花费了数年时间来实现一个宏伟的想法。他向自己提议建造一台机器,不仅能够执行算术计算,而且能够执行所有分析计算(如果它们的定律已知)。起初,人们对这样一项事业的想法感到震惊。但我们越冷静地反思,成功就越不可能出现,并且人们认为,它可能取决于某种普遍原理的发现,如果应用于机器,后者可能能够机械地翻译可以通过代数符号向其指示的操作。……
Struck with similar reflections, Mr. Babbage has devoted some years to the realization of a gigantic idea. He proposed to himself nothing less than the construction of a machine capable of executing not merely arithmetical calculations, but even all those of analysis, if their laws are known. The imagination is at first astounded at the idea of such an undertaking; but the more calm reflection we bestow on it, the less impossible does success appear, and it is felt that it may depend on the discovery of some principle so general, that, if applied to machinery, the latter may be capable of mechanically translating the operations which may be indicated to it by algebraical notation. …
当采用分析来解决任何问题时,通常要执行两类操作:第一,各种系数的数值计算;第二,计算各种系数。其次,它们的分布与受它们影响的数量有关。例如,如果我们要获得两个二项式的乘积 ( a + bx )( m + nx ),结果将表示为am + ( an + bm ) x + bnx 2,其中我们必须首先计算表达式am、an、bm、bn;然后取an + bm之和;最后,将得到的系数分别分配到变量的幂中。为了通过机器再现这些操作,机器必须拥有两组不同的能力:第一,执行数值计算的能力;第二,执行数值计算的能力。其次,正确分配所获得的价值。
When analysis is employed for the solution of any problem, there are usually two classes of operations to execute: first, the numerical calculation of the various coefficients; and secondly, their distribution in relation to the quantities affected by them. If, for example, we have to obtain the product of two binomials $(a + bx)(m + nx)$, the result will be represented by $am + (an + bm)x + bnx^2$, in which expression we must first calculate am, an, bm, bn; then take the sum of an + bm; and lastly, respectively distribute the coefficients thus obtained amongst the powers of the variable. In order to reproduce these operations by means of a machine, the latter must therefore possess two distinct sets of powers: first, that of executing numerical calculations; secondly, that of rightly distributing the values so obtained.
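The two classes of operations in the binomial example can be separated explicitly. The following is a minimal sketch; the function name `multiply_binomials` and the sample numbers are illustrative assumptions.

```python
def multiply_binomials(a, b, m, n):
    """Return (c0, c1, c2) such that (a + b*x)(m + n*x) = c0 + c1*x + c2*x**2."""
    # First class of operations: numerical calculation of the partial products.
    am, an, bm, bn = a * m, a * n, b * m, b * n
    # Second class: distribute the results among the powers of x.
    return (am, an + bm, bn)


c0, c1, c2 = multiply_binomials(2, 3, 5, 7)
x = 2
# Both sides of the identity evaluated at the same x should agree.
print((2 + 3 * x) * (5 + 7 * x), c0 + c1 * x + c2 * x**2)
```

The arithmetic step knows nothing about x; only the position of each result in the returned tuple records which power of the variable it belongs to.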
但是,如果指导每一个部分操作都需要人为干预,那么在正确性和时间经济性的标题下就不会获得任何好处;因此,当引入同一问题的原始数值数据时,机器必须具有自行执行解决向其提出的问题所需的所有连续操作的附加条件。因此,由于从要执行的计算或要解决的问题的性质已被指示给它的那一刻起,机器就通过其自身的内在力量,自行完成所有中间操作,这些操作导致对于所提出的结果,必须排除一切尝试和猜测的方法,而只能承认直接的计算过程。
But if human intervention were necessary for directing each of these partial operations, nothing would be gained under the heads of correctness and economy of time; the machine must therefore have the additional requisite of executing by itself all the successive operations required for the solution of a problem proposed to it, when once the primitive numerical data for this same problem have been introduced. Therefore, since, from the moment that the nature of the calculation to be executed or of the problem to be resolved have been indicated to it, the machine is, by its own intrinsic power, of itself to go through all the intermediate operations which lead to the proposed result, it must exclude all methods of trial and guess-work, and can only admit the direct processes of calculation.
必然如此;因为机器不是一个会思考的存在,而只是一个按照强加于它的法则行动的自动机。这是其作者必须进行的最早的研究之一,这是最基本的,即找到一种方法来实现一个数字除以另一个数字,而不使用通常算术规则所指示的猜测方法。实现这种结合的难度绝非最小的。但这取决于其他所有人的成功。由于我无法在这里解释达到这一目的的过程,因此我们必须承认算术的前四种运算,即加法、减法、乘法和除法,可以通过干预以直接的方式执行。机器的。当然,机器因此能够执行各种数值计算,因为所有此类计算最终都会分解为我们刚才命名的四种运算。为了构想机器现在如何根据既定的法则执行其功能,我们首先要了解它在物质上表示数字的方式。
It is necessarily thus; for the machine is not a thinking being, but simply an automaton which acts according to the laws imposed upon it. This being fundamental, one of the earliest researches its author had to undertake, was that of finding means for effecting the division of one number by another without using the method of guessing indicated by the usual rules of arithmetic. The difficulties of effecting this combination were far from being among the least; but upon it depended the success of every other. Under the impossibility of my here explaining the process through which this end is attained, we must limit ourselves to admitting that the first four operations of arithmetic, that is addition, subtraction, multiplication and division, can be performed in a direct manner through the intervention of the machine. This granted, the machine is thence capable of performing every species of numerical calculation, for all such calculations ultimately resolve themselves into the four operations we have just named. To conceive how the machine can now go through its functions according to the laws laid down, we will begin by giving an idea of the manner in which it materially represents numbers.
让我们设想一个由无限数量的圆盘组成的桩或垂直柱,所有圆盘都被一根公共轴穿过它们的中心,每个圆盘都可以绕着该轴进行独立的旋转运动。如果在每个圆盘的边缘写有构成我们的数字字母表的十个数字,那么我们可以通过将一系列这些数字排列在同一垂直线上,以这种方式表达任何数字。为此目的,第一个圆盘代表个位,第二个圆盘代表十位,第三个圆盘代表百位,依此类推。当两个数字被这样写在两个不同的列上时,我们可以建议将它们彼此进行算术组合,并在第三列上获得结果。一般来说,如果我们有一系列由圆盘组成的列,我们将这些列指定为V 0、V 1、V 2、V 3、V 4等,例如,我们可能需要除以列V 1上写的数字通过V 4列的结果,得到V 7列的结果。为了实现这一操作,我们必须赋予机器两种不同的安排:通过第一个,它准备执行除法,通过第二个,向它指示要操作的列,以及要表示结果的列。如果在这种除法之后,例如将其他列上的两个数字相加,则机器的两个原始布置必须同时改变。相反,如果要进行一系列相同性质的操作,那么最初的安排将保留,而只有第二个必须改变。因此,可以与机器的各个部分通信的布置可以分为两个主要类别:
Let us conceive a pile or vertical column consisting of an indefinite number of circular discs, all pierced through their centres by a common axis, around which each of them can take an independent rotatory movement. If round the edge of each of these discs are written the ten figures which constitute our numerical alphabet, we may then, by arranging a series of these figures in the same vertical line, express in this manner any number whatever. It is sufficient for this purpose that the first disc represent units, the second tens, the third hundreds, and so on. When two numbers have been thus written on two distinct columns, we may propose to combine them arithmetically with each other, and to obtain the result on a third column. In general, if we have a series of columns consisting of discs, which columns we will designate as V0, V1, V2, V3, V4, &c., we may require, for instance, to divide the number written on the column V1 by that on the column V4, and to obtain the result on the column V7. To effect this operation, we must impart to the machine two distinct arrangements; through the first it is prepared for executing a division, and through the second the columns it is to operate on are indicated to it, and also the column on which the result is to be represented. If this division is to be followed, for example, by the addition of two numbers taken on other columns, the two original arrangements of the machine must be simultaneously altered. If, on the contrary, a series of operations of the same nature is to be gone through, then the first of the original arrangements will remain, and the second alone must be altered. Therefore, the arrangements that may be communicated to the various parts of the machine may be distinguished into two principal classes:
首先,相对于Operations。
First, that relative to the Operations.
其次,相对于Variables。
Secondly, that relative to the Variables.
后者是指指示要操作的列。至于操作本身,它们是由一个特殊的装置执行的,该装置由名称mill指定,并且它本身包含一定数量的列,类似于变量的列。当两个数字要组合在一起时,机器首先将它们从写入它们的列中擦除,也就是说,它在表示数字的两条垂直线的每个圆盘上放置零;并将数据传输至工厂。在那里,设备已被适当地设置用于所需的操作,后者被实现,并且当完成时,结果本身被转移到应已被指示的变量列。因此,磨机是机器工作的部分,变量列构成了表示和排列结果的部分。经过前面的解释,我们可能会发现,所有分数和无理数结果都将用小数表示。假设每列有四十个圆盘,这个扩展将足以满足通常需要的所有近似程度。
By this latter we mean that which indicates the columns to be operated on. As for the operations themselves, they are executed by a special apparatus, which is designated by the name of mill, and which itself contains a certain number of columns, similar to those of the Variables. When two numbers are to be combined together, the machine commences by effacing them from the columns where they are written, that is, it places zero on every disc of the two vertical lines on which the numbers were represented; and it transfers the numbers to the mill. There, the apparatus having been disposed suitably for the required operation, this latter is effected, and, when completed, the result itself is transferred to the column of Variables which shall have been indicated. Thus the mill is that portion of the machine which works, and the columns of Variables constitute that where the results are represented and arranged. After the preceding explanations, we may perceive that all fractional and irrational results will be represented in decimal fractions. Supposing each column to have forty discs, this extension will be sufficient for all degrees of approximation generally required.
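The division of labour between the columns of Variables and the mill can be sketched in a few lines of Python. This is only an illustrative model: the class name `Engine`, the method `operate`, and the sample numbers are assumptions made for this example, not anything described in the memoir. It does follow the text in one respect: both operand columns are set to zero when their numbers pass to the mill.

```python
import operator

# Hypothetical model: the four operations the mill can be set up to perform.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


class Engine:
    """Toy model of the Variable columns (a list) and the mill (a function call)."""

    def __init__(self, n_columns=8):
        # Every Variable column starts with zero on all its discs.
        self.V = [0] * n_columns

    def operate(self, op, left, right, dest):
        # Transfer both operands to the mill and efface them from their columns.
        a, self.V[left] = self.V[left], 0
        b, self.V[right] = self.V[right], 0
        # The mill performs the operation; the result goes to the indicated column.
        self.V[dest] = OPERATIONS[op](a, b)


# Divide the number on V1 by that on V4 and place the result on V7.
engine = Engine()
engine.V[1] = 84
engine.V[4] = 4
engine.operate("/", 1, 4, 7)
print(engine.V)
```

The first argument of `operate` plays the part of the arrangement relative to the Operations; the three column indices play the part of the arrangement relative to the Variables.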
现在我们要问的是,机器如何能够在不借助人手的情况下,自行采取适合操作的连续配置。这个问题的解决方案是从用于制造织锦织物的提花设备中得到的,方法如下:
It will now be inquired how the machine can of itself, and without having recourse to the hand of man, assume the successive dispositions suited to the operations. The solution of this problem has been taken from Jacquard’s apparatus, used for the manufacture of brocaded stuffs, in the following manner:—
机织物中通常有两种线:一种是经纱或纵向纱线,另一种是纬纱或横向纱线,由称为梭子的仪器输送,并与纵向纱线或经纱交叉。当需要锦缎材料时,又需要防止某些线穿过纬纱,并且这根据要复制的设计的性质所确定的连续性。以前,这个过程漫长而困难,工人必须通过关注他要复制的设计,自己调节线的运动。因此,这种材料的价格很高,特别是当不同颜色的线进入织物时。为了简化制造,Jacquard 设计了连接计划每一组共同作用的线程,都有一个专属于该组的不同杠杆。所有这些杠杆的末端都是杆,这些杆结合在一起形成一束,通常具有带有矩形底座的平行六面体的形状。这些杆是圆柱形的,并且彼此间隔很小的间隔。因此,提升螺纹的过程被分解为按所需顺序移动这些不同的杠杆臂的过程。为了实现这一点,需要使用一块矩形纸板,其尺寸比杠杆臂束的一部分稍大。如果将该片材施加到该束的底部,然后将前进运动传递到纸板,则后者将随之移动该束的所有杆,并因此移动与每个杆连接的线。但是,如果粘贴板不是普通的,而是穿有与与其相接触的杠杆末端相对应的孔,那么,由于每个杠杆在后者运动期间都会穿过粘贴板,因此它们都将保持在其各自的位置。地方。因此,我们看到很容易确定纸板上孔的位置,即在任何给定时刻,都会有一定数量的杠杆,因此会有一定数量的螺纹包,而其余的则保持在它们的位置。是。假设这个过程按照要执行的模式所指示的规律连续重复,我们认为这个模式可能会在物体上再现。为此,我们只需按照法律要求组成一系列卡片,并将它们按适当的顺序依次排列即可;然后,通过使它们经过多边形梁,该多边形梁连接成对于梭子的每次行程都转动一个新的面,然后该面将被平行于其自身推靠在杠杆臂束上,升起的操作线程将定期执行。因此,我们看到织锦薄纸可以以以前难以获得的精度和速度制造。
Two species of threads are usually distinguished in woven stuffs; one is the warp or longitudinal thread, the other the woof or transverse thread, which is conveyed by the instrument called the shuttle, and which crosses the longitudinal thread or warp. When a brocaded stuff is required, it is necessary in turn to prevent certain threads from crossing the woof, and this according to a succession which is determined by the nature of the design that is to be reproduced. Formerly this process was lengthy and difficult, and it was requisite that the workman, by attending to the design which he was to copy, should himself regulate the movements the threads were to take. Thence arose the high price of this description of stuffs, especially if threads of various colours entered into the fabric. To simplify this manufacture, Jacquard devised the plan of connecting each group of threads that were to act together, with a distinct lever belonging exclusively to that group. All these levers terminate in rods, which are united together in one bundle, having usually the form of a parallelopiped with a rectangular base. The rods are cylindrical, and are separated from each other by small intervals. The process of raising the threads is thus resolved into that of moving these various lever-arms in the requisite order. To effect this, a rectangular sheet of pasteboard is taken, somewhat larger in size than a section of the bundle of lever-arms. If this sheet be applied to the base of the bundle, and an advancing motion be then communicated to the pasteboard, this latter will move with it all the rods of the bundle, and consequently the threads that are connected with each of them. But if the pasteboard, instead of being plain, were pierced with holes corresponding to the extremities of the levers which meet it, then, since each of the levers would pass through the pasteboard during the motion of the latter, they would all remain in their places. 
We thus see that it is easy so to determine the position of the holes in the pasteboard, that, at any given moment, there shall be a certain number of levers, and consequently of parcels of threads, raised, while the rest remain where they were. Supposing this process is successively repeated according to a law indicated by the pattern to be executed, we perceive that this pattern may be reproduced on the stuff. For this purpose we need merely compose a series of cards according to the law required, and arrange them in suitable order one after the other; then, by causing them to pass over a polygonal beam which is so connected as to turn a new face for every stroke of the shuttle, which face shall then be impelled parallelly to itself against the bundle of lever-arms, the operation of raising the threads will be regularly performed. Thus we see that brocaded tissues may be manufactured with a precision and rapidity formerly difficult to obtain.
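The way a pierced card selects which rods move can be modelled as follows. This is a minimal sketch under simplifying assumptions: rods are numbered from zero, a card is represented by the set of positions where it has holes, and the function name `raised_rods` is invented for the example.

```python
def raised_rods(n_rods, holes):
    """Return the rods that a card pushes when pressed against the bundle.

    A rod facing a hole passes through the pasteboard and stays in place;
    a rod facing solid pasteboard is pushed, raising its group of threads.
    """
    return [rod for rod in range(n_rods) if rod not in holes]


# A short chain of cards, one per stroke of the shuttle.
cards = [{0, 2}, {1}, set()]
pattern = [raised_rods(4, card) for card in cards]
print(pattern)
```

An unpierced card (the empty set) pushes every rod, which matches the plain pasteboard described in the text.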
与刚刚描述的类似的布置已被引入分析引擎中。它包含两种主要类型的卡:第一是操作卡,通过它可以对机器的各个部分进行处理,以执行任何确定的一系列操作,例如加法、减法、乘法和除法;其次,变量卡,它向机器指示要表示结果的列。当卡片开始运动时,根据要实现的过程的性质连续地安排机器的各个部分,并且机器同时通过其所包含的各个机构来执行这些过程。被构成。
Arrangements analogous to those just described have been introduced into the Analytical Engine. It contains two principal species of cards: first, Operation cards, by means of which the parts of the machine are so disposed as to execute any determinate series of operations, such as additions, subtractions, multiplications, and divisions; secondly, cards of the Variables, which indicate to the machine the columns on which the results are to be represented. The cards, when put in motion, successively arrange the various portions of the machine according to the nature of the processes that are to be effected, and the machine at the same time executes these processes by means of the various pieces of mechanism of which it is constituted.
为了更完美地理解这个问题,让我们选择两个具有两个未知量的一次方程的解作为例子。设以下两个方程,其中x和y为未知量:—
In order more perfectly to conceive the thing, let us select as an example the resolution of two equations of the first degree with two unknown quantities. Let the following be the two equations, in which x and y are the unknown quantities:—
$$mx + ny = d, \qquad m'x + n'y = d'.$$
我们推导出 $x = \frac{dn' - d'n}{n'm - nm'}$,y 也有类似的表达式。让我们继续用 V0、V1、V2 等来表示包含数字的不同列,并假设已选择前八列来表示 m、n、d、m′、n′、d′、n 和 n′ 所代表的数字,这意味着 V0 = m、V1 = n、V2 = d、V3 = m′、V4 = n′、V5 = d′、V6 = n、V7 = n′。
We deduce $x = \frac{dn' - d'n}{n'm - nm'}$, and for y an analogous expression. Let us continue to represent by V0, V1, V2, &c. the different columns which contain the numbers, and let us suppose that the first eight columns have been chosen for expressing on them the numbers represented by m, n, d, m′, n′, d′, n and n′, which implies that V0 = m, V1 = n, V2 = d, V3 = m′, V4 = n′, V5 = d′, V6 = n, V7 = n′.
卡命令的一系列操作以及获得的结果可以如图 3.1所示。
The series of operations commanded by the cards, and the results obtained, may be represented in Figure 3.1.
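Since the figure is not reproduced here, the following Python sketch shows one way such a series of cards could compute x for the system mx + ny = d, m′x + n′y = d′, using x = (dn′ − d′n)/(n′m − nm′). The destination columns V8 to V14, the order of the cards, and the function name `solve_x` are assumptions made for illustration. The sketch also suggests a plausible reason for writing n and n′ twice (on V6 and V7): each operand is effaced when it passes to the mill, so a quantity used in two products needs two columns.

```python
from fractions import Fraction
import operator

OPERATIONS = {"*": operator.mul, "-": operator.sub, "/": operator.truediv}

# Each card: (operation, left column, right column, result column).
# Column numbers above 7 are an assumed layout, not taken from the figure.
CARDS = [
    ("*", 2, 4, 8),     # V8  = d  * n'
    ("*", 5, 1, 9),     # V9  = d' * n
    ("*", 7, 0, 10),    # V10 = n' * m   (second copy of n')
    ("*", 6, 3, 11),    # V11 = n  * m'  (second copy of n)
    ("-", 8, 9, 12),    # V12 = dn' - d'n
    ("-", 10, 11, 13),  # V13 = n'm - nm'
    ("/", 12, 13, 14),  # V14 = x
]


def solve_x(m, n, d, m2, n2, d2):
    """Run the cards on the data and return x (m2, n2, d2 stand for m', n', d')."""
    data = (m, n, d, m2, n2, d2, n, n2)  # V0..V7 as listed in the text
    V = [Fraction(v) for v in data] + [Fraction(0)] * 7
    for op, left, right, dest in CARDS:
        a, b = V[left], V[right]
        V[left] = V[right] = Fraction(0)  # operands are effaced from the store
        V[dest] = OPERATIONS[op](a, b)
    return V[14]


# 2x + 3y = 8 and x - y = -1 have the solution x = 1, y = 2.
print(solve_x(2, 3, 8, 1, -1, -1))
```

The same `CARDS` list serves for any system of this form; only the numerical data change.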
……我们可以从这些解释中推断出以下重要的结论,即。由于卡片仅指示要执行的操作的性质以及要执行操作的变量列,因此这些卡片本身将具有分析的所有一般性,而它们实际上只是分析的翻译。现在我们将进一步研究机器必须克服的一些困难,如果它要完成分析。有些函数在经过零或无穷大时,其性质必然会发生变化,或者当它们经过这些极限时,其值不能被接受。当出现这种情况时,机器能够通过铃声发出通知,表明正在发生通过零或无穷大的过程,然后它会停止,直到服务员再次将其设置为接下来可能进行的任何过程。希望它能够执行。如果这个过程已经被预见到,那么机器将不会响铃,而是将自己安排为呈现与随后通过零和无穷大的操作相关的新卡片。这些新牌可能会在第一张牌之后出现,但只能在刚刚提到的两种情况中的一种或另一种发生时才发挥作用。
…We may deduce the following important consequence from these explanations, viz. that since the cards only indicate the nature of the operations to be performed, and the columns of Variables with which they are to be executed, these cards will themselves possess all the generality of analysis, of which they are in fact merely a translation. We shall now further examine some of the difficulties which the machine must surmount, if its assimilation to analysis is to be complete. There are certain functions which necessarily change in nature when they pass through zero or infinity, or whose values cannot be admitted when they pass these limits. When such cases present themselves, the machine is able, by means of a bell, to give notice that the passage through zero or infinity is taking place, and it then stops until the attendant has again set it in action for whatever process it may next be desired that it shall perform. If this process has been foreseen, then the machine, instead of ringing, will so dispose itself as to present the new cards which have relation to the operation that is to succeed the passage through zero and infinity. These new cards may follow the first, but may only come into play contingently upon one or other of the two circumstances just mentioned taking place.
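The two behaviours described here, stopping with a bell or passing to cards prepared in advance, can be caricatured in Python. This is a loose sketch: the function name `divide_with_bell`, the returned labels, and the idea of modelling only a zero divisor are all assumptions made for the example, and the memoir does not specify the mechanism at this level.

```python
def divide_with_bell(numerator, divisor, contingency_cards=None):
    """Attempt a division and report what the machine would do next.

    Returns a pair (action, payload):
      ("result", quotient)      the ordinary case
      ("switch_cards", cards)   a zero divisor was foreseen; new cards come into play
      ("bell", None)            a zero divisor was not foreseen; ring and stop
    """
    if divisor != 0:
        return ("result", numerator / divisor)
    if contingency_cards is not None:
        return ("switch_cards", contingency_cards)
    return ("bell", None)


print(divide_with_bell(6, 3))
print(divide_with_bell(6, 0))
print(divide_with_bell(6, 0, contingency_cards=["use limiting value"]))
```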
让我们考虑 $ab^n$ 形式的项;由于卡片只是分析公式的翻译,因此在这种特殊情况下它们的数量必须相同,无论n的值是多少;也就是说,无论将b提升到n次方所需的乘法次数是多少(我们暂时假设n是整数)。现在,由于指数n表示b要与自身相乘n次,并且所有这些操作都具有相同的性质,因此使用一张操作卡就足够了,即命令乘法的那张。……
Let us consider a term of the form $ab^n$; since the cards are but a translation of the analytical formula, their number in this particular case must be the same, whatever be the value of n; that is to say, whatever be the number of multiplications required for elevating b to the nth power (we are supposing for the moment that n is a whole number). Now, since the exponent n indicates that b is to be multiplied n times by itself, and all these operations are of the same nature, it will be sufficient to employ one single operation-card, viz. that which orders the multiplication. …
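The point that a single operation card suffices, whatever the value of n, corresponds to what is now written as a loop. In the Python sketch below, the function `multiply` stands in for the one multiplication card, and the names `a_times_b_to_n` and `count` are assumptions introduced for the example.

```python
def multiply(u, v):
    """Stands in for the single operation card that orders a multiplication."""
    return u * v


def a_times_b_to_n(a, b, n):
    """Compute a * b**n for a whole number n >= 0 by reusing one operation.

    Returns the result together with the number of multiplications performed.
    """
    result = a
    count = 0
    for _ in range(n):
        result = multiply(result, b)  # the same card, applied again
        count += 1
    return result, count


print(a_times_b_to_n(5, 2, 3))
```

The number of multiplications grows with n, but the program text, like the set of cards, stays the same size.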
回顾我们对分析机的解释,我们可以得出结论,它基于两个原则:第一个原则是每个算术计算最终都取决于四个主要运算——加法、减法、乘法和除法;第二,可以将每个分析计算简化为级数多项的系数计算。如果最后一个原则成立,则所有分析操作都在引擎的范围内。换个角度来看:卡片的使用提供了与代数公式相同的一般性,因为这样的公式只是表明了达到某个确定结果所需的运算的性质和顺序,同样,卡片只是命令引擎执行这些相同的操作;但为了使这些机制能够达到任何目的,必须在每个特定情况下引入问题的数字数据。因此,同一系列的卡片将适用于所有性质相同的问题,除了数字数据之外不需要任何改变。从这个角度来看,卡片仅仅是代数公式的翻译,或者,更好地表达它,是另一种形式的分析符号。
Resuming what we have explained concerning the Analytical Engine, we may conclude that it is based on two principles: the first consisting in the fact that every arithmetical calculation ultimately depends on four principal operations—addition, subtraction, multiplication, and division; the second, in the possibility of reducing every analytical calculation to that of the coefficients for the several terms of a series. If this last principle be true, all the operations of analysis come within the domain of the engine. To take another point of view: the use of the cards offers a generality equal to that of algebraical formulæ, since such a formula simply indicates the nature and order of the operations requisite for arriving at a certain definite result, and similarly the cards merely command the engine to perform these same operations; but in order that the mechanisms may be able to act to any purpose, the numerical data of the problem must in every particular case be introduced. Thus the same series of cards will serve for all questions whose sameness of nature is such as to require nothing altered excepting the numerical data. In this light the cards are merely a translation of algebraical formulæ, or, to express it better, another form of analytical notation.
由于发动机具有其自身特有的工作模式,因此在每种特定情况下都必须根据机器所拥有的装置来安排一系列计算;对于这样或那样的过程,对于计算器来说可能非常容易,对于引擎来说可能很长且复杂,反之亦然。
Since the engine has a mode of acting peculiar to itself, it will in every particular case be necessary to arrange the series of calculations conformably to the means which the machine possesses; for such or such a process which might be very easy for a calculator may be long and complicated for the engine, and vice versa.
从最一般的角度考虑,机器的基本目标是根据规定的定律计算数值系数的值,然后将其适当地分布在表示变量的列上,由此得出:对公式和结果的解释超出了它的范围,除非这种解释本身确实可以通过机器使用的符号来表达。因此,尽管它本身不是反映的存在,但它仍然可以被视为执行智能概念的存在。这些卡片接收到这些概念的印记,并将其动作所需的命令传输到组成引擎的各种机制。当引擎建成后,制作卡片的难度就会降低;但由于这些仅仅是代数公式的翻译,因此,通过一些简单的符号,就可以很容易地将它们委托给工人来执行。因此,整个智力劳动将仅限于公式的准备,该公式必须适合引擎的计算。
Considered under the most general point of view, the essential object of the machine being to calculate, according to the laws dictated to it, the values of numerical coefficients which it is then to distribute appropriately on the columns which represent the variables, it follows that the interpretation of formulæ and of results is beyond its province, unless indeed this very interpretation be itself susceptible of expression by means of the symbols which the machine employs. Thus, although it is not itself the being that reflects, it may yet be considered as the being which executes the conceptions of intelligence. The cards receive the impress of these conceptions, and transmit to the various trains of mechanism composing the engine the orders necessary for their action. When once the engine shall have been constructed, the difficulty will be reduced to the making out of the cards; but as these are merely the translation of algebraical formulæ, it will, by means of some simple notations, be easy to consign the execution of them to a workman. Thus the whole intellectual labour will be limited to the preparation of the formulæ, which must be adapted for calculation by the engine.
现在,承认可以制造这样的发动机,人们可能会问:它的效用是什么?重述;具有以下优点:一是精度严格。我们知道,数值计算通常是解决问题的绊脚石,因为错误很容易出现,而且检测这些错误并不总是那么容易。现在,引擎根据其工作模式的本质,在其运行过程中不需要人工干预,在正确性的基础上提供了各种安全性:此外,它还带有自己的支票;因为在每次操作结束时,它不仅会打印结果,还会打印问题的数值数据;以便于验证问题是否被正确提出。其次,节省时间:为了让自己相信这一点,我们只需要记住,两个数字(每个数字由二十位数字组成)的乘法最多需要三分钟。同样,当需要进行一长串相同的计算时,例如需要形成数值表格时,可以使用机器来同时给出多个结果,这将大大减少整体的计算量。的过程。第三,智力的经济性:简单的算术计算需要由具有一定能力的人来完成;当我们进行更复杂的计算并希望在特定情况下使用代数公式时,必须具备以某种程度的初步数学研究为前提的知识。现在,由于发动机能够自行执行所有这些纯粹的物质操作,因此节省了智力劳动,而智力劳动可能会得到更有利的利用。因此,引擎可以被认为是一个真正的数字制造厂,它将为许多依赖于数字的有用科学和艺术提供帮助。再说一次,谁能预见这样一项发明的后果呢?事实上,有多少宝贵的观察对于科学的进步来说实际上是贫瘠的,因为没有足够的力量来计算结果!一个天才的头脑需要时间专门用于冥想,而他眼睁睁地看着这些时间被物质的日常运作所夺走,漫长而枯燥的计算的视角给他的心灵带来了多大的沮丧啊!然而,他必须通过艰苦的分析途径才能得出真理。但除非以数字为指导,否则他无法实现这一目标。因为没有数字,我们就无法揭开自然奥秘的面纱。因此,构建一种能够在此类研究中帮助人类弱点的装置的想法,是一个构想,如果实现,将标志着科学史上的一个辉煌时代。组成这个巨大装置的所有不同部件和所有轮子工作都已制定了计划,并研究了它们的作用;但这些尚未在图纸和机械符号中完全结合在一起。巴贝奇先生的天才必须激发人们的信心,为这项事业取得成功的希望提供了合理的理由。在我们向指导这一事业的智慧致敬的同时,让我们表达对完成这一事业的渴望。
Now, admitting that such an engine can be constructed, it may be inquired: what will be its utility? To recapitulate; it will afford the following advantages:—First, rigid accuracy. We know that numerical calculations are generally the stumbling-block to the solution of problems, since errors easily creep into them, and it is by no means always easy to detect these errors. Now the engine, by the very nature of its mode of acting, which requires no human intervention during the course of its operations, presents every species of security under the head of correctness: besides, it carries with it its own check; for at the end of every operation it prints off, not only the results, but likewise the numerical data of the question; so that it is easy to verify whether the question has been correctly proposed. Secondly, economy of time: to convince ourselves of this, we need only recollect that the multiplication of two numbers, consisting each of twenty figures, requires at the very utmost three minutes. Likewise, when a long series of identical computations is to be performed, such as those required for the formation of numerical tables, the machine can be brought into play so as to give several results at the same time, which will greatly abridge the whole amount of the processes. Thirdly, economy of intelligence: a simple arithmetical computation requires to be performed by a person possessing some capacity; and when we pass to more complicated calculations, and wish to use algebraical formulæ in particular cases, knowledge must be possessed which presupposes preliminary mathematical studies of some extent. Now the engine, from its capability of performing by itself all these purely material operations, spares intellectual labour, which may be more profitably employed. Thus the engine may be considered as a real manufactory of figures, which will lend its aid to those many useful sciences and arts that depend on numbers. 
Again, who can foresee the consequences of such an invention? In truth, how many precious observations remain practically barren for the progress of the sciences, because there are not powers sufficient for computing the results! And what discouragement does the perspective of a long and arid computation cast into the mind of a man of genius, who demands time exclusively for meditation, and who beholds it snatched from him by the material routine of operations! Yet it is by the laborious route of analysis that he must reach truth; but he cannot pursue this unless guided by numbers; for without numbers it is not given us to raise the veil which envelopes the mysteries of nature. Thus the idea of constructing an apparatus capable of aiding human weakness in such researches, is a conception which, being realized, would mark a glorious epoch in the history of the sciences. The plans have been arranged for all the various parts, and for all the wheel-work, which compose this immense apparatus, and their action studied; but these have not yet been fully combined together in the drawings and mechanical notation. The confidence which the genius of Mr. Babbage must inspire, affords legitimate ground for hope that this enterprise will be crowned with success; and while we render homage to the intelligence which directs it, let us breathe aspirations for the accomplishment of such an undertaking.
差分机被构建用于制表其积分的特定函数是 $\Delta^7 u_z = 0$。该引擎经过专门设计和改造以实现的目的是计算航海和天文表。$\Delta^7 u_z = 0$ 的积分为
The particular function whose integral the Difference Engine was constructed to tabulate, is $\Delta^7 u_z = 0$. The purpose which that engine has been specially intended and adapted to fulfil, is the computation of nautical and astronomical tables. The integral of $\Delta^7 u_z = 0$ being
$$u_z = a + bx + cx^2 + dx^3 + ex^4 + fx^5 + gx^6,$$
常数 a、b、c 等表示在构成该引擎的七列圆盘上。因此,它可以精确且无限地将通项包含在上式中的所有级数制成表格;并且它还可以在或大或小的区间内,近似地将所有其他能够用差分法制表的级数制成表格。
the constants a, b, c, &c. are represented on the seven columns of discs, of which the engine consists. It can therefore tabulate accurately and to an unlimited extent, all series whose general term is comprised in the above formula; and it can also tabulate approximatively between intervals of greater or less extent, all other series which are capable of tabulation by the Method of Differences.
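The Method of Differences mentioned here tabulates a series using additions alone: if the differences of some order are constant, each new value is obtained by adding every column's neighbour of the next higher order into it. A series whose seventh differences vanish is a polynomial of degree at most six, which is why seven columns suffice. The Python sketch below illustrates the arithmetic only; the function name `tabulate` and the update order are assumptions, and the actual engine's timing of additions may differ.

```python
def tabulate(initial_differences, count):
    """Tabulate a series by the Method of Differences, using addition only.

    initial_differences = [u_0, first difference, second difference, ...];
    the last entry is the constant difference of highest order.
    """
    columns = list(initial_differences)
    table = []
    for _ in range(count):
        table.append(columns[0])
        # Lowest order first, so each addition uses the neighbour's value
        # from before this step.
        for i in range(len(columns) - 1):
            columns[i] += columns[i + 1]
    return table


# Squares: u_0 = 0, first difference 1, constant second difference 2.
squares = tabulate([0, 1, 2], 6)
# Cubes: u_0 = 0, differences 1, 6, and constant third difference 6.
cubes = tabulate([0, 1, 6, 6], 5)
print(squares, cubes)
```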
相反,分析机并非仅适用于将某一个特定函数(而非其他函数)的结果制成表格,而是适用于展开任何函数并将其制成表格。事实上,该引擎可以被描述为任何通用性和复杂性程度的任何不定函数的物质表达,例如 F(x, y, z, log x, sin y, $x^p$, &c.),可以看到,这是任意数量的量的所有其他可能函数的函数。
The Analytical Engine, on the contrary, is not merely adapted for tabulating the results of one particular function and of no other, but for developing and tabulating any function whatever. In fact the engine may be described as being the material expression of any indefinite function of any degree of generality and complexity, such as for instance, F(x, y, z, log x, sin y, $x^p$, &c.), which is, it will be observed, a function of all other possible functions of any number of quantities.
在这种我们可以称之为引擎的中性或零状态下,它随时准备通过构成其机构一部分的卡片(按提花织机所用卡片的原理应用)来接收我们可能希望展开或制表的任何特殊函数的印记。这些卡片本身(以回忆录本身所解释的方式)包含所考虑的特定函数的展开规律,并且它们迫使机构以某种相应的顺序相应地行动。例如,最简单的情况之一是假设 F(x, y, z, &c. &c.) 是特定函数 $\Delta^n u_z = 0$,差分机仅对不超过 7 的 n 值为其制表。在这种情况下,卡片将命令该机构执行一系列操作,从而制表
In this, which we may call the neutral or zero state of the engine, it is ready to receive at any moment, by means of cards constituting a portion of its mechanism (and applied on the principle of those used in the Jacquard-loom), the impress of whatever special function we may desire to develop or to tabulate. These cards contain within themselves (in a manner explained in the Memoir itself) the law of development of the particular function that may be under consideration, and they compel the mechanism to act accordingly in a certain corresponding order. One of the simplest cases would be for example, to suppose that F(x, y, z, &c. &c.) is the particular function $\Delta^n u_z = 0$ which the Difference Engine tabulates for values of n only up to 7. In this case the cards would order the mechanism to go through that succession of operations which would tabulate
$$u_z = a + bx + cx^2 + \cdots + mx^{n-1},$$
其中n可以是任何数字。
where n might be any number whatever.
然而,这些卡与特定数值数据的规定无关。它们仅仅确定要实现的运算,这些运算当然可以对无限多种特定数值执行,并且不会产生任何明确的数值结果,除非问题的数值数据已被印在问题的必要部分上。机制的列车。在上面的示例中,获得算术结果的第一个重要步骤是用特定数字替换n以及进入函数的其他原始量。……
These cards, however, have nothing to do with the regulation of the particular numerical data. They merely determine the operations to be effected, which operations may of course be performed on an infinite variety of particular numerical values, and do not bring out any definite numerical results unless the numerical data of the problem have been impressed on the requisite portions of the train of mechanism. In the above example, the first essential step towards an arithmetical result would be the substitution of specific numbers for n, and for the other primitive quantities which enter into the function. …
操作机制甚至可以独立于任何要操作的对象而投入运行(尽管当然不会产生任何结果)。同样,它可能会作用于除数字之外的其他事物,如果发现的物体的相互基本关系可以通过抽象操作科学的关系来表达,并且也应该易于适应发动机的操作符号和机制的作用。 。例如,假设和声科学和音乐创作中音高的基本关系容易受到这种表达和适应的影响,那么引擎就可以创作出任何复杂程度或程度的复杂而科学的音乐作品。……
The operating mechanism can even be thrown into action independently of any object to operate upon (although of course no result could then be developed). Again, it might act upon other things besides number, were objects found whose mutual fundamental relations could be expressed by those of the abstract science of operations, and which should be also susceptible of adaptations to the action of the operating notation and mechanism of the engine. Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent. …
分析机的显着特征,以及使得赋予机械装置如此广泛的能力成为可能,使该机成为抽象代数的执行右手,是引入了 Jacquard 为之设计的原理。通过打孔卡调节制造过程中最复杂的图案织锦的东西。这就是这两种发动机的区别所在。差异引擎中不存在任何此类内容。我们可以最贴切地说,分析机编织代数图案就像提花织机编织花朵和叶子一样。在我们看来,这里的独创性远远超出了差分机所声称的。我们不想否认后者的所有此类主张。我们相信,这是迄今为止唯一的提议或尝试,旨在构建一种基于连续差分顺序原理的计算机,并且能够打印出自己的结果;该引擎超越了它的前辈,无论是在它可以执行的计算范围方面,还是在它可以影响计算的便利性、确定性和准确性方面,以及在执行过程中不需要人类智能干预方面其计算。然而,它的本质仅限于严格的算术,而且它远不是第一个或唯一一个或多或少成功地构造算术计算机的方案。
The distinctive characteristic of the Analytical Engine, and that which has rendered it possible to endow mechanism with such extensive faculties as bid fair to make this engine the executive right-hand of abstract algebra, is the introduction into it of the principle which Jacquard devised for regulating, by means of punched cards, the most complicated patterns in the fabrication of brocaded stuffs. It is in this that the distinction between the two engines lies. Nothing of the sort exists in the Difference Engine. We may say most aptly, that the Analytical Engine weaves algebraical patterns just as the Jacquard-loom weaves flowers and leaves. Here, it seems to us, resides much more of originality than the Difference Engine can be fairly entitled to claim. We do not wish to deny to this latter all such claims. We believe that it is the only proposal or attempt ever made to construct a calculating machine founded on the principle of successive orders of differences, and capable of printing off its own results; and that this engine surpasses its predecessors, both in the extent of the calculations which it can perform, in the facility, certainty and accuracy with which it can effect them, and in the absence of all necessity for the intervention of human intelligence during the performance of its calculations. Its nature is, however, limited to the strictly arithmetical, and it is far from being the first or only scheme for constructing arithmetical calculating machines with more or less of success.
然而,当应用卡片的想法出现时,算术的界限就被超越了。分析机与单纯的“计算机器”并不存在共同点。它完全拥有自己的地位;它所提出的考虑因素本质上是最有趣的。在使机制能够将无限种类和范围的通用符号连续组合在一起的过程中,在物质的操作和数学科学最抽象分支的抽象心理过程之间建立了统一的联系。一种新的、广泛的、强大的语言被开发出来,用于未来的分析,在其中运用它的真理,以便这些真理可以比我们迄今为止所拥有的手段更快、更准确地实际应用于人类的目的。使成为可能。因此,不仅精神和物质,而且数学世界中的理论和实践,都彼此建立了更加密切和有效的联系。我们不知道有记录表明,到目前为止,任何与分析机的性质有关的东西都已被提出,甚至被认为是一种实际的可能性,只不过是一种思考或一种推理机。……
The bounds of arithmetic were however outstepped the moment the idea of applying the cards had occurred; and the Analytical Engine does not occupy common ground with mere “calculating machines.” It holds a position wholly its own; and the considerations it suggests are most interesting in their nature. In enabling mechanism to combine together general symbols in successions of unlimited variety and extent, a uniting link is established between the operations of matter and the abstract mental processes of the most abstract branch of mathematical science. A new, a vast, and a powerful language is developed for the future use of analysis, in which to wield its truths so that these may become of more speedy and accurate practical application for the purposes of mankind than the means hitherto in our possession have rendered possible. Thus not only the mental and the material, but the theoretical and the practical in the mathematical world, are brought into more intimate and effective connexion with each other. We are not aware of its being on record that anything partaking in the nature of what is so well designated the Analytical Engine has been hitherto proposed, or even thought of, as a practical possibility, any more than the idea of a thinking or of a reasoning machine. …
那些倾向于非常严格的功利主义观点的人可能会觉得分析引擎的特殊能力涉及抽象和思辨科学的问题,而不是涉及日常和普通人类利益的问题。这些人可能对他们认为没有用的任何科学分支(根据他们对这个词的定义)缺乏同情心或可能不了解,可能会认为该发动机的任务,既然另一个发动机如果一项工作已经在进行中,那么投入更多的金钱和劳动力将是一种贫瘠且毫无成效的工作;事实上,这是一部过度劳作的作品。然而,即使在功利方面,我们也不怀疑分析机的扩展能力将产生非常有价值的实际结果;如果我们有空间的话,我们认为我们现在可以暗示其中的一些结果;以及其他一些,虽然目前还无法预见,但随着科学要求的日益增加,以及对发动机功率的更深入的实际了解,如果它确实存在的话,将会带来这些。……
Those who incline to very strictly utilitarian views may perhaps feel that the peculiar powers of the Analytical Engine bear upon questions of abstract and speculative science, rather than upon those involving every-day and ordinary human interests. These persons being likely to possess but little sympathy, or possibly acquaintance, with any branches of science which they do not find to be useful (according to their definition of that word), may conceive that the undertaking of that engine, now that the other one is already in progress, would be a barren and unproductive laying out of yet more money and labour; in fact, a work of supererogation. Even in the utilitarian aspect, however, we do not doubt that very valuable practical results would be developed by the extended faculties of the Analytical Engine; some of which results we think we could now hint at, had we the space; and others, which it may not yet be possible to foresee, but which would be brought forth by the daily increasing requirements of science, and by a more intimate practical acquaintance with the powers of the engine, were it in actual existence. …
A. A. L.
A. A. L.
这里提到的分析机的那部分称为仓库。它包含由 M. Menabrea 描述的无限数量的圆盘列。读者可能会想象出一堆相当大的绘图员,一个一个地垂直堆积到相当高的高度,每个计数器的边缘都以相等的间隔刻有从 0 到 9 的数字;如果他认为计数器实际上并不是一个一个地叠放在另一个上以便接触,而是以很小的垂直距离固定在一根垂直穿过计数器中心的公共轴线上,并且每个圆盘都可以围绕该轴线水平旋转以便可以看到其边缘上刻写的任何所需数字,他将对这些列之一有很好的了解。任何一列上最低的圆盘属于个位,上面的下一个属于十位,再上面的下一个属于百位,依此类推。因此,如果我们想在发动机的一根柱子上刻上 1345,它会这样写:
That portion of the Analytical Engine here alluded to is called the storehouse. It contains an indefinite number of the columns of discs described by M. Menabrea. The reader may picture to himself a pile of rather large draughtsmen heaped perpendicularly one above another to a considerable height, each counter having the digits from 0 to 9 inscribed on its edge at equal intervals; and if he then conceives that the counters do not actually lie one upon another so as to be in contact, but are fixed at small intervals of vertical distance on a common axis which passes perpendicularly through their centres, and around which each disc can revolve horizontally so that any required digit amongst those inscribed on its margin can be brought into view, he will have a good idea of one of these columns. The lowest of the discs on any column belongs to the units, the next above to the tens, the next above this to the hundreds, and so on. Thus, if we wished to inscribe 1345 on a column of the engine, it would stand thus:—
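The arrangement just described, with the units on the lowest disc and higher powers of ten above it, can be sketched in Python as follows. The function names `set_column` and `show_column`, and the choice of four discs for the demonstration, are assumptions made for this example.

```python
def set_column(number, n_discs=4):
    """Return the digit shown on each disc, index 0 being the lowest (units) disc."""
    if not 0 <= number < 10 ** n_discs:
        raise ValueError("number does not fit on a column of this height")
    return [(number // 10 ** k) % 10 for k in range(n_discs)]


def show_column(discs):
    """Draw the column as it would stand, highest disc at the top."""
    return "\n".join(str(digit) for digit in reversed(discs))


discs = set_column(1345)
print(show_column(discs))
```

Read from top to bottom, the printed column shows 1, 3, 4, 5, with 5 on the lowest (units) disc.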
在差分机中,有七根这样的柱子并排排成一排,工作机构在它们后面延伸:整个机械体的一般形状是四棱柱(或多或少接近立方体) ; 结果总是出现在发动机的垂直面上,该垂直面上包含圆盘列,与观众可能放置的面相对。在分析引擎中会有更多这样的列,可能至少有两百个。其整个机构所采用的精确形式和布置尚未最终确定。
In the Difference Engine there are seven of these columns placed side by side in a row, and the working mechanism extends behind them: the general form of the whole mass of machinery is that of a quadrangular prism (more or less approaching to the cube); the results always appearing on that perpendicular face of the engine which contains the columns of discs, opposite to which face a spectator may place himself. In the Analytical Engine there would be many more of these columns, probably at least two hundred. The precise form and arrangement which the whole mass of its mechanism will assume is not yet finally determined.
我们可以方便地在纸上用如图3.2所示的图表来表示圆盘的列。
We may conveniently represent the columns of discs on paper in a diagram like Figure 3.2.
V是为了方便参考任何专栏,无论是书面还是口头形式,因此都进行了编号。之所以选择字母V而不是任何其他字母,是因为这些列被指定(读者在阅读回忆录时会发现)变量,有时是变量列,或变量列。这个名称的由来是,柱子上的值注定会发生变化,即以各种可以想象的方式发生变化。但有必要防止自然的误解,即列仅用于接收分析中的变量值。公式,而不是常数。这些列被称为变量,其理由与常量和变量之间的分析区别完全无关。为了防止混淆,我们在翻译和注释中,当我们使用这个词来表示引擎的列时,将变量写为大写字母,而当我们表示引擎的列时,将变量写为小写字母。公式的变量。类似地,变量卡表示属于引擎列的任何卡。
The V’s are for the purpose of convenient reference to any column, either in writing or speaking, and are consequently numbered. The reason why the letter V is chosen for the purpose in preference to any other letter, is because these columns are designated (as the reader will find in proceeding with the Memoir) the Variables, and sometimes the Variable columns, or the columns of Variables. The origin of this appellation is, that the values on the columns are destined to change, that is to vary, in every conceivable manner. But it is necessary to guard against the natural misapprehension that the columns are only intended to receive the values of the variables in an analytical formula, and not of the constants. The columns are called Variables on a ground wholly unconnected with the analytical distinction between constants and variables. In order to prevent the possibility of confusion, we have, both in the translation and in the notes, written Variable with a capital letter when we use the word to signify a column of the engine, and variable with a small letter when we mean the variable of a formula. Similarly, Variable-cards signify any cards that belong to a column of the engine.
回到图表的解释:顶部的每个圆圈都包含代数符号+或-,其中一个可以替换另一个,具体取决于下面一列中表示的数字是正数还是负数。以类似的方式,代数过程的任何其他纯符号结果都可以出现在这些圆圈中。在注释 A 中,谈到了开发符号结果的可行性,其难度并不亚于数值结果。符号圆圈下方的零代表每个圆盘,前面应该有数字 0。图中仅示出了四层零,但是这些可以被认为代表三十或四十,或者可能需要的任何数量的盘层。由于每个圆盘可以表示任意数字,每个圆盘可以表示任意符号,所以每一列的圆盘都可以在机器的限制范围内调整以表示任何正数或负数;这些限制取决于机构的垂直范围,即取决于一列的圆盘数量。
To return to the explanation of the diagram: each circle at the top is intended to contain the algebraic sign + or −, either of which can be substituted for the other, according as the number represented on the column below is positive or negative. In a similar manner any other purely symbolical results of algebraical processes might be made to appear in these circles. In Note A, the practicability of developing symbolical with no less ease than numerical results has been touched on. The zeros beneath the symbolic circles represent each of them a disc, supposed to have the digit 0 presented in front. Only four tiers of zeros have been figured in the diagram, but these may be considered as representing thirty or forty, or any number of tiers of discs that may be required. Since each disc can present any digit, and each circle any sign, the discs of every column may be so adjusted as to express any positive or negative number whatever within the limits of the machine; which limits depend on the perpendicular extent of the mechanism, that is, on the number of discs to a column.
零下方的每个方块均用于刻录我们喜欢的任何通用符号或符号组合;应当理解,紧接上面的列上表示的数字是该符号或符号组合的数值。例如,让我们表示三个量a、n、x,并进一步假设a = 5、n = 7、x = 98。我们应该有图 3.3。
Each of the squares below the zeros is intended for the inscription of any general symbol or combination of symbols we please; it being understood that the number represented on the column immediately above is the numerical value of that symbol, or combination of symbols. Let us, for instance, represent the three quantities a, n, x, and let us further suppose that a = 5, n = 7, x = 98. We should have Figure 3.3.
现在我们可以以各种方式组合这些符号,以形成它们的任何所需的一个或多个函数,然后可以将每个这样的函数写在括号下面,每个括号把进入其下方所写函数的那些量(且仅那些量)联合在一起。当我们决定了想要计算其数值的特定函数时,还必须在右侧指定另一列来接收结果,并将该函数写在该列下方的方框中。在上面的例子中,我们可能有以下任何一个函数:
We may now combine these symbols in a variety of ways, so as to form any required function or functions of them, and we may then inscribe each such function below brackets, every bracket uniting together those quantities (and those only) which enter into the function inscribed below it. We must also, when we have decided on the particular function whose numerical value we desire to calculate, assign another column to the right-hand for receiving the results, and must inscribe the function in the square below this column. In the above instance we might have any one of the following functions:—
让我们选择第一个。在计算之前,情况如下(图 3.4)。给出数据后,我们现在必须将适当的卡放入引擎中,以在所选的特定功能的情况下指导操作。在这种情况下,这些操作将是——
Let us select the first. It would stand as follows, previous to calculation (Figure 3.4). The data being given, we must now put into the engine the cards proper for directing the operations in the case of the particular function chosen. These operations would in this instance be,—
首先,进行六次乘法以获得 $x^n$(对于上述特定数据,$= 98^7$)。
First, six multiplications in order to get $x^n$ ($= 98^7$ for the above particular data).
其次,再进行一次乘法即可得到 $a \cdot x^n$($= 5 \cdot 98^7$)。
Secondly, one multiplication in order then to get $a \cdot x^n$ ($= 5 \cdot 98^7$).
总共需要七次乘法来完成整个过程。因此,我们可以代表他们:——
In all, seven multiplications to complete the whole process. We may thus represent them:—
然而,在解决问题的连续阶段,乘法将对来自不同列的数字对进行运算。也就是说,相同的操作会针对不同的操作主体进行。这里再次说明前面注释中关于发动机指导其操作的独立方式的评论。在确定 $ax^n$ 的值时,操作是同类的,但在计算的连续阶段分布在不同的操作主体之间。正是通过属于变量本身的某些穿孔卡,操作的动作被分布以适应每个特定的功能。操作卡仅以一般方式确定操作的连续性。事实上,它们将磨机中包含的机制的所有部分置于一系列不同的状态,我们可以分别将其称为加法状态、乘法状态等。在这些状态中的每一个中,该机制都准备好以该状态特有的方式对可能被允许进入其作用范围内的任何一对数字起作用。磨机一次只能存在其中一种运行状态;该机制的性质也是这样的:一次只能接收一对数字并对其进行操作。现在,为了确保磨机连续不断地接收正确的数字对,并且正确地定位对任何数字对执行的操作的结果,每个变量都有属于它自己的卡。首先,它有一类卡,其作用是允许变量上的数字进入磨机,并在那里进行操作。这些卡可以称为供应卡。它们为磨机提供适当的食物。其次,每个变量都有另一类卡,其作用是允许变量从磨机接收数字。这些卡可以称为接收卡。它们规定结果的位置,无论是临时结果还是最终结果。在我们看来,一般的变量卡(包括前面的两类)可能更适合被称为分配卡,因为正是通过它们,操作的动作以及该动作的结果才得以正确分配。
The multiplications would, however, at successive stages in the solution of the problem, operate on pairs of numbers, derived from different columns. In other words, the same operation would be performed on different subjects of operation. And here again is an illustration of the remarks made in the preceding Note on the independent manner in which the engine directs its operations. In determining the value of $ax^n$, the operations are homogeneous, but are distributed amongst different subjects of operation, at successive stages of the computation. It is by means of certain punched cards, belonging to the Variables themselves, that the action of the operations is so distributed as to suit each particular function. The Operation-cards merely determine the succession of operations in a general manner. They in fact throw all that portion of the mechanism included in the mill into a series of different states, which we may call the adding state, or the multiplying state, &c. respectively. In each of these states the mechanism is ready to act in the way peculiar to that state, on any pair of numbers which may be permitted to come within its sphere of action. Only one of these operating states of the mill can exist at a time; and the nature of the mechanism is also such that only one pair of numbers can be received and acted on at a time. Now, in order to secure that the mill shall receive a constant supply of the proper pairs of numbers in succession, and that it shall also rightly locate the result of an operation performed upon any pair, each Variable has cards of its own belonging to it. It has, first, a class of cards whose business it is to allow the number on the Variable to pass into the mill, there to be operated upon. These cards may be called the Supplying-cards. They furnish the mill with its proper food. Each Variable has, secondly, another class of cards, whose office it is to allow the Variable to receive a number from the mill. These cards may be called the Receiving-cards. They regulate the location of results, whether temporary or ultimate results. The Variable-cards in general (including both the preceding classes) might, it appears to us, be even more appropriately designated the Distributive-cards, since it is through their means that the action of the operations, and the results of this action, are rightly distributed.
供应变量卡有两种类型,分别适合于实现两个不同的辅助目的:但由于这些修改与当前主题无关,我们将在另一个地方注意到它们。
There are two varieties of the Supplying Variable-cards, respectively adapted for fulfilling two distinct subsidiary purposes: but as these modifications do not bear upon the present subject, we shall notice them in another place.
在上述 $ax^n$ 的情况下,操作卡仅命令七次乘法,也就是说,它们命令磨机连续七次处于乘法状态(不涉及要对其数字进行操作的特定列)。相应的分配变量卡在每次连续的乘法中介入,并完成特定情况所需的分配(图 3.5)。
In the above case of $ax^n$, the Operation-cards merely order seven multiplications, that is, they order the mill to be in the multiplying state seven successive times (without any reference to the particular columns whose numbers are to be acted upon). The proper Distributive Variable-cards step in at each successive multiplication, and cause the distributions requisite for the particular case (Figure 3.5).
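The division of labour between Operation-cards and Variable-cards can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the names `run`, `store`, `operation_cards`, and `variable_cards`, the column labels, and the tuple layout of a Variable-card are all hypothetical and are not taken from the Notes or from Babbage's design. The Operation-cards say only "multiply" seven times; the Variable-cards decide which columns feed the mill and which column receives each result.

```python
# Hypothetical model of the mill; names and card layout are illustrative only.
def run(store, operation_cards, variable_cards):
    """Apply each Operation-card to the columns named by its Variable-cards."""
    mill_states = {"*": lambda p, q: p * q, "+": lambda p, q: p + q}
    for op, (supply_1, supply_2, receive) in zip(operation_cards, variable_cards):
        # Supplying-cards pass two columns into the mill;
        # the Receiving-card fixes where the result is placed.
        store[receive] = mill_states[op](store[supply_1], store[supply_2])
    return store

# Columns V1..V3 hold the data a = 5, n = 7, x = 98; V4 receives the result.
store = {"V1": 5, "V2": 7, "V3": 98, "V4": 0}

# Seven identical Operation-cards: the mill is put in the multiplying state each time.
operation_cards = ["*"] * 7

# The Variable-cards distribute the operands (the count is written out for n = 7).
variable_cards = (
    [("V3", "V3", "V4")]         # x * x            -> V4  (x^2)
    + [("V4", "V3", "V4")] * 5   # V4 * x, 5 times  -> V4  (x^3 ... x^7)
    + [("V1", "V4", "V4")]       # a * x^n          -> V4
)

run(store, operation_cards, variable_cards)
print(store["V4"] == 5 * 98**7)
```

In this sketch the list of Operation-cards carries no column names at all, which mirrors the remark that they act "without any reference to the particular columns whose numbers are to be acted upon."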
引擎可以连续计算所有这些函数。完成 $ax^n$ 后,可以把函数 $x^{an}$ 写在括号下以代替 $ax^n$,并开始新的计算(新函数的相应操作卡和变量卡当然开始发挥作用)。结果将出现在 $V_5$ 上。对于量 a、n、x 的任意数量的不同函数,依此类推。在后续计算过程中,每个结果可以永久保留在其列中,以便计算完所有函数后,它们的值将同时存在于 $V_4$、$V_5$、$V_6$ 等上;或者每个结果可以(在打印出来或以任何指定方式使用后)被擦除,为其后继者让路。对于后一种安排,$V_4$ 下的方框中应依次写入函数 $ax^n$、$x^{an}$、$anx$ 等。……
The engine might be made to calculate all these in succession. Having completed $ax^n$, the function $x^{an}$ might be written under the brackets instead of $ax^n$, and a new calculation commenced (the appropriate Operation and Variable-cards for the new function of course coming into play). The results would then appear on $V_5$. So on for any number of different functions of the quantities a, n, x. Each result might either permanently remain on its column during the succeeding calculations, so that when all the functions had been computed, their values would simultaneously exist on $V_4$, $V_5$, $V_6$, &c.; or each result might (after being printed off, or used in any specified manner) be effaced, to make way for its successor. The square under $V_4$ ought, for the latter arrangement, to have the functions $ax^n$, $x^{an}$, $anx$, &c. successively inscribed in it. …
我们越深入地分析这样一个引擎执行其过程并获得其结果的方式,我们就越能感觉到它如何清晰地真实而公正地阐述了数学分析各个步骤的相互关系和联系;它多么清楚地将那些实际上是不同和独立的事物分开,并将那些相互依赖的事物联合起来。
The further we analyse the manner in which such an engine performs its processes and attains its results, the more we perceive how distinctly it places in a true and just light the mutual relations and connexion of the various steps of mathematical analysis; how clearly it separates those things which are in reality distinct and independent, and unites those which are mutually dependent.
A. A. L.
A. A. L.
那些可能希望以最有效的方式研究提花织机原理的人,即。如果要进行实际观察,只需走进阿德莱德画廊或理工学院。在这些宝贵的科学插图宝库中,织布工一直在提花织机前工作,并准备提供有关其设备的构造和作用方式的任何所需信息。拉德纳的《百科全书》中关于丝绸制造的一卷包含有关提花织机的一章,也可以参考一下。
Those who may desire to study the principles of the Jacquard-loom in the most effectual manner, viz. that of practical observation, have only to step into the Adelaide Gallery or the Polytechnic Institution. In each of these valuable repositories of scientific illustration, a weaver is constantly working at a Jacquard-loom, and is ready to give any information that may be desired as to the construction and modes of acting of his apparatus. The volume on the manufacture of silk, in Lardner’s Cyclopædia, contains a chapter on the Jacquard-loom, which may also be consulted with advantage.
然而,迄今为止在编织领域中所使用的梳理机的应用模式对于在如此多样和复杂的过程中所希望实现的所有简化来说还没有足够强大的力量,这些简化是为了实现所需要的那些过程。分析引擎的目的。设计了一种方法,根据某些法律在技术上指定支持某些组中的卡。此扩展的目的是确保将任何特定卡或卡组连续使用任意次的可能性。解决一个问题。在每种特定情况下,是否应利用该权力将取决于所考虑的问题可能需要的操作的性质。M. Menabrea 提到了这个过程,这是一个非常重要的简化。有人建议将其用于该艺术的互惠互利,虽然它本身与抽象科学领域没有明显的联系,但已证明对后者非常有价值,因为它提出了新的和抽象的原理。单一的应用领域,似乎很可能将代数组合完全置于机制的范围内,就像所有那些交叉线容易受到影响的各种复杂性一样。通过将背衬系统引入提花织机本身,应该具有对称性并遵循任何程度的规则规律的图案可以通过相对较少的卡片来编织。
The mode of application of the cards, as hitherto used in the art of weaving, was not found, however, to be sufficiently powerful for all the simplifications which it was desirable to attain in such varied and complicated processes as those required in order to fulfil the purposes of an Analytical Engine. A method was devised of what was technically designated backing the cards in certain groups according to certain laws. The object of this extension is to secure the possibility of bringing any particular card or set of cards into use any number of times successively in the solution of one problem. Whether this power shall be taken advantage of or not, in each particular instance, will depend on the nature of the operations which the problem under consideration may require. The process is alluded to by M. Menabrea, and it is a very important simplification. It has been proposed to use it for the reciprocal benefit of that art, which, while it has itself no apparent connexion with the domains of abstract science, has yet proved so valuable to the latter, in suggesting the principles which, in their new and singular field of application, seem likely to place algebraical combinations not less completely within the province of mechanism, than are all those varied intricacies of which intersecting threads are susceptible. By the introduction of the system of backing into the Jacquard-loom itself, patterns which should possess symmetry, and follow regular laws of any extent, might be woven by means of comparatively few cards.
那些了解这种织机机构的人会认识到,上述改进在实践中很容易实现,只要在必要的情况下,使悬挂有图案卡系列的棱镜向后而不是向前旋转即可;直到通过这样做,任何特定的卡牌或一组卡牌,已经履行过一次职责并以普通的规则连续传递,被带回到它在前一次使用之前所占据的位置。然后棱镜恢复向前旋转,从而使所讨论的纸牌或一组纸牌第二次进入游戏。这个过程显然可以重复任意多次。
Those who understand the mechanism of this loom will perceive that the above improvement is easily effected in practice, by causing the prism over which the train of pattern-cards is suspended to revolve backwards instead of forwards, at pleasure, under the requisite circumstances; until, by so doing, any particular card, or set of cards, that has done duty once, and passed on in the ordinary regular succession, is brought back to the position it occupied just before it was used the preceding time. The prism then resumes its forward rotation, and thus brings the card or set of cards in question into play a second time. This process may obviously be repeated any number of times.
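The effect of backing can be pictured with a small simulation. The following is a minimal sketch, assuming a train of cards held in a list and a prism whose position is an index into it; the function name `run_with_backing` and its parameters are hypothetical and chosen only for illustration.

```python
def run_with_backing(cards, back_after, back_by, repeats):
    """Feed cards forward; each time the card at index `back_after` has acted,
    turn the prism back over a group of `back_by` cards, up to `repeats` times,
    and then continue forward as usual."""
    used = []
    position = 0
    backs_left = repeats
    while position < len(cards):
        used.append(cards[position])
        if position == back_after and backs_left > 0:
            backs_left -= 1
            position -= back_by - 1  # return to the first card of the group
        else:
            position += 1            # ordinary forward rotation
    return used

# The group (B, C) does duty once in the ordinary succession and twice more by backing.
print(run_with_backing(["A", "B", "C", "D"], back_after=2, back_by=2, repeats=2))
```

Four physical cards here do the work of eight, which is the saving the passage attributes to the backing system.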
A. A. L.
A. A. L.
……
…
许多不熟悉数学研究的人会想象,因为引擎的任务是以数字表示法给出其结果,所以其过程的本质必然是算术和数字的,而不是代数和分析的。这是一个错误。引擎可以准确地排列和组合其数字量,就好像它们是字母或任何其他通用符号一样;事实上,只要做出相应规定,它就可以用代数符号表示结果。它可能会同时产生三组结果,即。符号结果(如注释 A 和 B 中已经提到的)、数值结果(其主要和主要目标);代数结果采用文字表示法。
Many persons who are not conversant with mathematical studies, imagine that because the business of the engine is to give its results in numerical notation, the nature of its processes must consequently be arithmetical and numerical, rather than algebraical and analytical. This is an error. The engine can arrange and combine its numerical quantities exactly as if they were letters or any other general symbols; and in fact it might bring out its results in algebraical notation, were provisions made accordingly. It might develop three sets of results simultaneously, viz. symbolic results (as already alluded to in Notes A and B), numerical results (its chief and primary object); and algebraical results in literal notation.
…操作循环…必须被理解为表示重复多次的任何一组操作。无论是只重复两次,还是无限次,都同样是一个循环;因为正是重复发生的事实才构成了这种情况。在许多分析情况下,存在一组重复出现的一个或多个循环;也就是说,一个循环的一个循环,或者一个循环的循环。……
… A cycle of operations … must be understood to signify any set of operations which is repeated more than once. It is equally a cycle, whether it be repeated twice only, or an indefinite number of times; for it is the fact of a repetition occurring at all that constitutes it such. In many cases of analysis there is a recurring group of one or more cycles; that is, a cycle of a cycle, or a cycle of cycles. …
A. A. L.
A. A. L.
现存一幅精美的雅卡尔(Jacquard)织锦肖像,其织造需要 24,000 张卡片。
There is in existence a beautiful woven portrait of Jacquard, in the fabrication of which 24,000 cards were required.
M. Menabrea 提到了重复卡片的力量,并在注释 C 中进行了更全面的解释,它极大地减少了所需卡片的数量。显然,这种机械改进尤其适用于数学运算中出现循环的任何地方,并且在为发动机的计算准备数据时,需要安排过程的顺序和组合,以便尽可能多地获得它们尽可能对称且循环地进行,以便最大限度地发挥背衬系统的机械优势。在这里观察巧妙的机械装置满足和增强分析资源的价值的方式是很有趣的。我们在其中看到了注释A中提到的纯数学和机械部门之间相互调整的一个例子,这是计算引擎发明成功的主要和必要条件。这种调整所提供的资源的性质主要有两种。在某些情况下,一个部门的困难(也许本身是无法克服的)可以通过另一个部门的设施来克服;有时(如在本例中),通过与另一方面的相应优势相结合,一个方面的优势将变得更加强大和更可用。
The power of repeating the cards, alluded to by M. Menabrea, and more fully explained in Note C, reduces to an immense extent the number of cards required. It is obvious that this mechanical improvement is especially applicable wherever cycles occur in the mathematical operations, and that, in preparing data for calculations by the engine, it is desirable to arrange the order and combination of the processes with a view to obtain them as much as possible symmetrically and in cycles, in order that the mechanical advantages of the backing system may be applied to the utmost. It is here interesting to observe the manner in which the value of an analytical resource is met and enhanced by an ingenious mechanical contrivance. We see in it an instance of one of those mutual adjustments between the purely mathematical and the mechanical departments, mentioned in Note A, as being a main and essential condition of success in the invention of a calculating engine. The nature of the resources afforded by such adjustments would be of two principal kinds. In some cases, a difficulty (perhaps in itself insurmountable) in the one department would be overcome by facilities in the other; and sometimes (as in the present case) a strong point in the one would be rendered still stronger and more available by combination with a corresponding strong point in the other.
作为循环和支持的组合系统可以在多大程度上减少所需牌张数量的例子,我们将选择一个案例,该案例将其置于强有力的证据中,并且同样具有作为完全不同类型的优势。任何其他注释中提到的问题。假设需要从十个简单方程中消除九个变量,其形式为:
As a mere example of the degree to which the combined systems of cycles and of backing can diminish the number of cards requisite, we shall choose a case which places it in strong evidence, and which has likewise the advantage of being a perfectly different kind of problem from those that are mentioned in any of the other Notes. Suppose it be required to eliminate nine variables from ten simple equations of the form—
在继续之前,我们应该解释一下,我们的目的不是参考引擎变量上数据的实际排列来考虑这个问题,而只是将其视为所需操作的性质和数量的抽象问题。在其完整解决方案期间执行。
We should explain, before proceeding, that it is not our object to consider this problem with reference to the actual arrangement of the data on the Variables of the engine, but simply as an abstract question of the nature and number of the operations required to be performed during its complete solution.
第一步是消去前两个方程之间的第一个未知量 $x_0$。这可以通过以下形式获得——
The first step would be the elimination of the first unknown quantity $x_0$ between the first two equations. This would be obtained by the form—
为此需要运算 10 (×, ×, −)。第二步是消去第二个和第三个方程之间的 $x_0$,其运算将完全相同。那么我们总共应该进行以下运算:
for which the operations 10 (×, ×, −) would be needed. The second step would be the elimination of $x_0$ between the second and third equations, for which the operations would be precisely the same. We should then have had altogether the following operations:—
以同样的方式继续,在所有连续的方程对之间完全消去 $x_0$ 的运算总数将是——
Continuing in the same manner, the total number of operations for the complete elimination of $x_0$ between all the successive pairs of equations would be—
然后,我们应该留下九个包含九个变量的简单方程,从中消去下一个变量 $x_1$,其运算总数将是
We should then be left with nine simple equations of nine variables from which to eliminate the next variable $x_1$, for which the total of the processes would be
然后我们应该留下八个包含八个变量的简单方程,从中消去 $x_2$,其运算将是——
We should then be left with eight simple equations of eight variables from which to eliminate $x_2$, for which the processes would be—
等等。因此,消除所有变量的总操作将是——
and so on. The total operations for the elimination of all the variables would thus be—
这样,三张操作卡即可完成 330 张此类卡的工作。
So that three Operation-cards would perform the office of 330 such cards.
如果我们采用包含n − 1 个变量的n 个简单方程,其中n是一个无限大的数字,情况就变得更加明显,因为同样的三张牌可能会取代数千或数百万张牌。
If we take n simple equations containing n − 1 variables, n being a number unlimited in magnitude, the case becomes still more obvious, as the same three cards might then take the place of thousands or millions of cards.
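The figure of 330 can be checked with a short count. The sketch below assumes the pattern the text states: with k equations left, each successive pair costs k repetitions of the triple (×, ×, −), and there are k − 1 such pairs. The function name `elimination_triples` is hypothetical.

```python
def elimination_triples(equations):
    """Count the (x, x, -) triples needed to eliminate all but one variable,
    eliminating one unknown between each successive pair of equations."""
    per_stage = []
    k = equations
    while k > 1:
        # k equations remain: (k - 1) successive pairs, each costing k triples.
        per_stage.append((k - 1) * k)
        k -= 1
    return per_stage, sum(per_stage)

stages, total = elimination_triples(10)
print(stages)  # triples per stage, starting with the elimination of x_0
print(total)   # total number of triples

# The same total in closed form: (n - 1) * n * (n + 1) / 3 for n equations.
n = 10
print(total == (n - 1) * n * (n + 1) // 3)
```

Each of the counted triples uses the same three Operation-cards in the same order, so with backing the three cards can be cycled instead of punching a fresh set for every repetition.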
我们现在要进一步注意已经注意到的事实,即提出的解决方案的公式并不一定需要实际制定出来,作为使引擎能够解决该问题的条件。只要我们知道要经历的一系列操作就足够了。在前面的例子中,只要稍微考虑一下,这一点就很明显了。这是一种值得特别注意的情况,因为这种发动机的潜在价值可能在其可能的最终结果中几乎无法计算。我们已经知道,有些函数的数值对于抽象科学和实用科学的确定来说都很重要,但其确定需要如此漫长和复杂的过程,尽管可以通过大量的研究来获得它们。花费时间、劳动力和金钱,但从这些角度来看,这实际上几乎是不可能实现的;我们可以想象,有些结果在实践中绝对不可能准确地获得,而这些结果的精确确定对于未来科学在其多样化、复杂和快速发展的领域中的某些需求可能非常重要的询问,到达。
We shall now draw further attention to the fact, already noticed, of its being by no means necessary that a formula proposed for solution should ever have been actually worked out, as a condition for enabling the engine to solve it. Provided we know the series of operations to be gone through, that is sufficient. In the foregoing instance this will be obvious enough on a slight consideration. And it is a circumstance which deserves particular notice, since herein may reside a latent value of such an engine almost incalculable in its possible ultimate results. We already know that there are functions whose numerical value it is of importance for the purposes both of abstract and of practical science to ascertain, but whose determination requires processes so lengthy and so complicated, that, although it is possible to arrive at them through great expenditure of time, labour and money, it is yet on these accounts practically almost unattainable; and we can conceive there being some results which it may be absolutely impossible in practice to attain with any accuracy, and whose precise determination it may prove highly important for some of the future wants of science, in its manifold, complicated and rapidly-developing fields of inquiry, to arrive at.
然而,在不进入猜想领域的情况下,我们将提到此刻想到的一个特定问题,它恰当地说明了这种引擎可以用来确定人类大脑难以或不可能无误算出的结果。在著名的三体问题的求解中,M. Clausen(Astroe. Nachrichten,第 406 期)给出了约 295 个月球扰动系数,它们是 Burg 的计算、Damoiseau 的两次计算和 Burckhardt 的一次计算的结果;其中有 14 个系数的代数符号不同;而在其余系数中,只有 101 个(约三分之一)在符号和数值上都完全一致。这些不一致的个体幅度通常很小,可能源于问题展开过程中抽象系数的错误确定,或源于由观测推导出的数据的差异,或源于这两种原因的结合。前者是天文计算中最常见的错误来源,而引擎将完全避免这种情况。
Without, however, stepping into the region of conjecture, we will mention a particular problem which occurs to us at this moment as being an apt illustration of the use to which such an engine may be turned for determining that which human brains find it difficult or impossible to work out unerringly. In the solution of the famous problem of the Three Bodies, there are, out of about 295 coefficients of lunar perturbations given by M. Clausen (Astroe. Nachrichten, No. 406) as the result of the calculations by Burg, of two by Damoiseau, and of one by Burckhardt, fourteen coefficients that differ in the nature of their algebraic sign; and out of the remainder there are only 101 (or about one-third) that agree precisely both in signs and in amount. These discordances, which are generally small in individual magnitude, may arise either from an erroneous determination of the abstract coefficients in the development of the problem, or from discrepancies in the data deduced from observation, or from both causes combined. The former is the most ordinary source of error in astronomical computations, and this the engine would entirely obviate.
我们甚至可以以任意方式为一系列公式发明定律,并让引擎对其进行处理,从而推导出我们可能不会想到获得的数值结果;但这在任何情况下都很难产生任何巨大的实际效用,或者被认为比作为一种哲学娱乐更重要。
We might even invent laws for series of formulæ in an arbitrary manner, and set the engine to work upon them, and thus deduce numerical results which we might not otherwise have thought of obtaining; but this would hardly perhaps in any instance be productive of any great practical utility, or calculated to rank higher than as a philosophical amusement.
A. A. L.
A. A. L.
最好防止对分析引擎的能力产生夸大的想法。在考虑任何新主题时,经常存在一种倾向:首先,高估我们发现已经有趣或非凡的事物;其次,当我们确实发现自己的观念已经超出真正站得住脚的范围时,又会出于一种自然的反应而低估事情的真实情况。
It is desirable to guard against the possibility of exaggerated ideas that might arise as to the powers of the Analytical Engine. In considering any new subject, there is frequently a tendency, first, to overrate what we find to be already interesting or remarkable; and, secondly, by a sort of natural reaction, to undervalue the true state of the case, when we do discover that our notions have surpassed those that were really tenable.
分析引擎没有任何意图来创造任何东西。它可以做任何我们知道如何命令它执行的事情。它可以遵循分析;但它没有能力预测任何分析关系或真理。它的职责是帮助我们提供我们已经熟悉的东西。当然,它的主要目的是通过其执行能力来实现这一目标。但它很可能以另一种方式对科学本身产生间接和相互的影响。因为,通过如此分配和组合真理和分析公式,使它们可以最容易、最迅速地适应引擎的机械组合,科学中许多主题的关系和本质必然会被引入新的视角,并进行更深入的研究。这绝对是这样一项发明的间接结果,而且有些推测性的结果。然而,很明显,根据一般原则,在为数学真理设计一种新的形式来记录并投入实际使用时,很可能会引发观点,这应该再次对主题的更理论阶段产生反应。除了所达到的主要目标之外,人类力量的所有延伸或人类知识的补充,还存在各种附带影响。……
The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform. It can follow analysis; but it has no power of anticipating any analytical relations or truths. Its province is to assist us in making available what we are already acquainted with. This it is calculated to effect primarily and chiefly of course, through its executive faculties; but it is likely to exert an indirect and reciprocal influence on science itself in another manner. For, in so distributing and combining the truths and the formulæ of analysis, that they may become most easily and rapidly amenable to the mechanical combinations of the engine, the relations and the nature of many subjects in that science are necessarily thrown into new lights, and more profoundly investigated. This is a decidedly indirect, and a somewhat speculative, consequence of such an invention. It is however pretty evident, on general principles, that in devising for mathematical truths a new form in which to record and throw themselves out for actual use, views are likely to be induced, which should again react on the more theoretical phase of the subject. There are in all extensions of human power, or additions to human knowledge, various collateral influences, besides the main and primary object attained. …
A. A. L.
A. A. L.
重印自梅纳布雷亚 (1843)。
Reprinted from Menabrea (1843).
当乔治·布尔(George Boole,1815-1864)还是亚里士多德两千年后的学生时,《先验分析》仍然是标准的逻辑文本。十八世纪连续数学取得了惊人的进步。行星的轨道和其他运动物体的行为现在可以通过计算准确地(尽管很乏味)确定。然而,尽管莱布尼茨对思想演算的愿景与他的无穷小演算相匹配,并且他尝试性地开始发展正式的推理规则,但逻辑并不比古代更加系统化。布尔意识到逻辑是数学的一个分支,因此受到规则的约束,这些规则可以成为命题计算的基础。他的目标是“在这篇论文中用微积分的符号语言表达推理的基本定律”(§ 4.1)。
When George Boole (1815–1864) was a student two millennia after Aristotle, the Prior Analytics was still a standard logic text. Astonishing advances in continuous mathematics had been made in the eighteenth century. The orbits of the planets and the behavior of other objects in motion could now be accurately—if tediously—determined by calculation. Yet in spite of Leibniz’s vision of a calculus of ideas to match his calculus of infinitesimals, and his tentative start at developing formal rules of reasoning, logic was barely more systematic than it had been in ancient times. Boole realized that logic was a branch of mathematics, and as such was subject to rules that could be the basis for calculations about propositions. His goal was “to give expression in this treatise to the fundamental laws of reasoning in the symbolical language of a Calculus” (§4.1).
数学家乔治·皮科克 (George Peacock) 是剑桥知识界的一员,其中包括查尔斯·巴贝奇 (Charles Babbage),他在《代数论》 (Peacock,1830) 中提出了代数观点,即“通过符号语言进行一般推理的科学”,这为布尔埋下了伏笔。然而,《布尔思维定律》的出版标志着计算机科学发展的关键时刻。他将逻辑简化为真假变量的代数,影响了其他当代逻辑学家,包括奥古斯都·德·摩根(Augustus De Morgan)(今天因其定律而被人们铭记)和约翰·维恩(John Venn)(因其图表而被人们铭记)。《思想法则》出版几年后,布尔的同胞威廉·杰文斯(William Jevons,1835-1882)建造了一台“逻辑钢琴”,用于对命题的真值进行计算。它不是一个有用的装置,仅限于四个命题,但它是我们现在所知的布尔逻辑的第一个机械化。布尔的逻辑代数概念实际上超越了布尔逻辑,延伸到了我们现在所说的朴素集合论和概率论。在第 30 页,布尔将他的词“类”解释为我们现在所说的集合,并明确指出类可能是空的、单例的或整个宇宙的可能性。布尔指出,变量x可以被无差别地视为一个命题,表明个体具有给定的属性,或者作为具有该属性的一类实体。然后,他继续用“+”表示并,“−”表示差,“×”或并置表示交,并推导出交换律和分配律,并推断出对偶或二分原理(没有任何东西可以属于一个集合及其补集)从幂等性的代数角度(集合与其自身的交集是同一集合)。这本书的布局是一系列的定义、命题和“规则”,所有这些都是为了指导逻辑公式(以及后来的概率)的代数运算。
The mathematician George Peacock, a member of a Cambridge intellectual circle that included Charles Babbage, foreshadowed Boole when, in his Treatise on Algebra (Peacock, 1830), he propounded a view of algebra as “the science of general reasoning by symbolical language.” Yet the publication of Boole’s Laws of Thought marked a crucial moment in the development of computer science. His reduction of logic to an algebra of true and false variables influenced other contemporary logicians, including Augustus De Morgan (remembered today for his laws) and John Venn (remembered for his diagrams). A few years after publication of The Laws of Thought, Boole’s countryman William Jevons (1835–1882) built a “logic piano” for doing calculations with the truth-values of propositions. It wasn’t a useful device, being restricted to four propositions, but it was the first mechanization of what we now know as boolean logic. Boole’s conception of logical algebra in fact extends beyond boolean logic to what we would now call naïve set theory and probability theory. On page 30, Boole explains his word “class” to mean what we now call a set, and explicitly states the possibility that a class might be empty, a singleton, or the entire universe. Boole articulates that a variable x can be treated indifferently as a proposition stating that an individual has a given property, or as a class of entities having that property. He then goes on to use “+” for union, “−” for difference, and “×” or juxtaposition for intersection, and to derive the commutative and distributive laws, and to infer the duality or dichotomy principle (nothing can belong to a set and to its complement) algebraically from idempotency (the intersection of a set with itself is the same set). The book is laid out as a series of definitions, propositions, and “rules,” all designed to instruct in algebraic manipulation of logical formulas (and, later in the work, of probabilities).
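The step from idempotency to the duality principle mentioned above can be written out as a short derivation. This is a sketch in modern notation, reading juxtaposition as intersection, $1$ as the universe, $0$ as the empty class, and $1 - x$ as the complement of $x$:

```latex
\begin{align*}
  x \cdot x &= x          && \text{(idempotency: a class intersected with itself)} \\
  x - x \cdot x &= 0      && \text{(subtract } x \cdot x \text{ from both sides)} \\
  x\,(1 - x) &= 0         && \text{(distributive law)}
\end{align*}
```

The last line says that the class of things that are both $x$ and not $x$ is empty, which is the statement that nothing can belong to a set and to its complement.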
乔治·布尔是一位很大程度上自学成才的数学家,他是一位英国鞋匠的儿子,并且只接受过初等教育。他在二十多岁的时候成功发表了严肃的数学研究,同时也以校长的身份养活自己。十年后,他在爱尔兰科克的新女王学院获得了教授职位,这是维多利亚女王同时捐赠的三所学院之一,“以促进爱尔兰的学术进步”(另外两所分别位于贝尔法斯特和戈尔韦)。正是在这个远离牛津和剑桥伟大数学中心的地方,布尔写下了他独特的著作,我们从中选取了一小部分。
George Boole was a largely self-taught mathematician, the son of an English shoemaker and the beneficiary of only primary education. He managed to publish serious mathematical research in his twenties while supporting himself as a schoolmaster. He won a professorship a decade later at the new Queen’s College at Cork in Ireland, one of three colleges Queen Victoria had simultaneously endowed “for the advancement of learning in Ireland” (the other two were in Belfast and Galway). It was from this perch, away from the great mathematical centers of Oxford and Cambridge, that Boole wrote his unique opus, from which we include a small selection.
在这些页面中,布尔耐心地解释了命题逻辑的一些规则和方法,包括交换律。现在这一切对我们来说似乎很平常,我们已经习惯了当时新颖的想法,即真与假可以用代数来操纵。近一个世纪后,克劳德·香农将布尔的思想大量融入新兴的数字电路领域(第 8 章)。我们的选择以真值函数的想法结束,并认识到通过将一个变量的值固定为真或假,可以将这种函数分解为子函数。布尔将这个过程比作可微函数的泰勒级数展开。
In these pages Boole patiently explains some of the rules and methods of propositional logic, including the commutative law. It all seems quite routine to us now, so used are we to the then-novel idea that true and false can be manipulated algebraically. Almost a century later, Claude Shannon incorporated Boole’s ideas wholesale into the emerging world of digital circuitry (chapter 8). Our selection closes with the idea of a truth-function and the insight that such a function can be decomposed into subfunctions by fixing the value of one variable to be true or false. Boole likened this process to a Taylor series expansion of a differentiable function.
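The decomposition described here, fixing one variable to true or false, can be tried out numerically. The sketch below assumes truth-values are encoded as the integers 1 and 0 and uses ordinary arithmetic; the helper `expand` and the sample function `f` are hypothetical and chosen only for illustration.

```python
from itertools import product

def expand(f, x, *rest):
    """Rebuild f(x, ...) from the two subfunctions obtained by fixing x:
    f(1, ...) * x + f(0, ...) * (1 - x), with 1 for true and 0 for false."""
    return f(1, *rest) * x + f(0, *rest) * (1 - x)

def f(x, y):
    # Sample truth-function (an assumption for illustration): x and not y.
    return x * (1 - y)

# The expansion agrees with f on every assignment of truth-values.
for x, y in product((0, 1), repeat=2):
    assert expand(f, x, y) == f(x, y)
print("expansion agrees with f on all four assignments")
```

Applying the same step to each remaining variable in turn breaks a truth-function into terms indexed by the rows of its truth table.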
布尔在爱尔兰的冷雨中步行三英里去讲课,因此染病,去世时年仅 49 岁(MacHale,2014)。
Boole died at age 49 of an infection contracted while walking three miles to his lecture in the cold Irish rain (MacHale, 2014).
以下论文的目的是研究进行推理的心灵运作的基本规律;用微积分的符号语言表达它们,并在此基础上建立逻辑学并构造其方法;使该方法本身成为应用数学概率学说的一般方法的基础;最后,从这些调查过程中看到的各种真理要素中收集一些关于人类心灵的本质和构成的可能暗示。……
THE design of the following treatise is to investigate the fundamental laws of those operations of the mind by which reasoning is performed; to give expression to them in the symbolical language of a Calculus, and upon this foundation to establish the science of Logic and construct its method; to make that method itself the basis of a general method for the application of the mathematical doctrine of Probabilities; and, finally, to collect from the various elements of truth brought to view in the course of these inquiries some probable intimations concerning the nature and constitution of the human mind. …
以下定义列举了符号的基本属性。
The essential properties of signs are enumerated in the following definition.
定义——符号是任意标记,具有固定的解释,并且可以与其他符号组合,并遵守依赖于它们相互解释的固定法律。
Definition.—A sign is an arbitrary mark, having a fixed interpretation, and susceptible of combination with other signs in subjection to fixed laws dependent upon their mutual interpretation.
……
…
语言作为推理工具的所有操作都可以通过由以下元素组成的符号系统来进行,即:
All the operations of Language, as an instrument of reasoning, may be conducted by a system of signs composed of the following elements, viz.:
第一。文字符号,如 x、y 等,将事物表示为我们概念的主题。
1st. Literal symbols, as x, y, &c., representing things as subjects of our conceptions.
第二。运算符号,如+、−、× ,代表心灵的运算,通过这些运算,事物的概念被组合或解析,从而形成涉及相同元素的新概念。
2nd. Signs of operation, as +, −, ×, standing for those operations of the mind by which the conceptions of things are combined or resolved so as to form new conceptions involving the same elements.
第三。身份符号, =。
3rd. The sign of identity, =.
这些逻辑符号在使用时遵循一定的规律,与代数科学中相应符号的规律部分一致,部分不同。
And these symbols of Logic are in their use subject to definite laws, partly agreeing with and partly differing from the laws of the corresponding symbols in the science of Algebra.
假设理性话语的真正要素的标准是,它们应该能够以最简单的形式并按照最简单的法则进行组合,因此组合应该产生所有其他已知和可想象的语言形式;并采用这一原则,考虑以下分类。
Let it be assumed as a criterion of the true elements of rational discourse, that they should be susceptible of combination in the simplest forms and by the simplest laws, and thus combining should generate all other known and conceivable forms of language; and adopting this principle, let the following classification be considered.
对于这一类,我们显然可以归入专有名词或普通名词,以及形容词。实际上,可以认为它们仅在这一点上有所不同:前者表达了它所指的单个或多个事物的实质存在;后者则隐含了这种存在。如果我们将普遍理解的主体“存在物”或“事物”附加到形容词上,它实际上就变成了一个名词,并且在推理的所有基本目的上都可以用名词代替。无论在心理层面的每一个细节上,说“水是一种流体的东西”与说“水是流体”是否是同一回事,二者至少在推理过程的表达上是等价的。
To this class we may obviously refer the substantive proper or common, and the adjective. These may indeed be regarded as differing only in this respect, that the former expresses the substantive existence of the individual thing or things to which it refers; the latter implies that existence. If we attach to the adjective the universally understood subject “being” or “thing,” it becomes virtually a substantive, and may for all the essential purposes of reasoning be replaced by the substantive. Whether or not, in every particular of the mental regard, it is the same thing to say, “Water is a fluid thing,” as to say, “Water is fluid”; it is at least equivalent in the expression of the processes of reasoning.
同样清楚的是,对于上述类别,我们必须归入任何按惯例可用来表达某种情况或关系的符号,而对这种情况或关系的详细说明将需要使用许多符号。诗歌用语中的修饰语常常属于这一类。它们通常是复合形容词,单独履行多词描述的职责。荷马的“深涡海洋”用 βαθυδίνης 这一个词体现了一个实质上的描述。按照惯例,任何其他描述,无论是针对想象力还是针对智力,都可以同样地由单个符号来表示,该符号的使用在所有要点上都遵循与形容词“好”或“伟大”的使用相同的法则。与主语“事物”结合,这样的符号实际上就变成了名词;并且通过单个名词可以表达事物和性质的组合含义。
It is clear also, that to the above class we must refer any sign which may conventionally be used to express some circumstance or relation, the detailed exposition of which would involve the use of many signs. The epithets of poetic diction are very frequently of this kind. They are usually compounded adjectives, singly fulfilling the office of a many-worded description. Homer’s “deep-eddying ocean” embodies a virtual description in the single word βαθυδίνης. And conventionally any other description addressed either to the imagination or to the intellect might equally be represented by a single sign, the use of which would in all essential points be subject to the same laws as the use of the adjective “good” or “great.” Combined with the subject “thing,” such a sign would virtually become a substantive; and by a single substantive the combined meaning both of thing and quality might be expressed.
现在让我们考虑一下上述含义中使用的符号x、y等所遵循的定律。
Let us now consider the laws to which the symbols x, y, &c., used in the above sense, are subject.
xy = yx. (4.1)
在x代表白色物体和y羊的情况下,该方程的任何一个成员都将代表“白羊”类别。概念形成的顺序可能有所不同,但概念所包含的个体事物却没有差异。以类似的方式,如果x代表“河口”,y代表“河流”,则表达式xy和yx将无差别地代表“是河口的河流”或“河口是河流”,在这种情况下的组合在普通语言中是由两个实词组成,而不是像上一个例子中那样由一个实词和一个形容词组成。
In the case of x representing white things, and y sheep, either of the members of this equation will represent the class of “white sheep.” There may be a difference as to the order in which the conception is formed, but there is none as to the individual things which are comprehended under it. In like manner, if x represent “estuaries,” and y “rivers,” the expressions xy and yx will indifferently represent “rivers that are estuaries,” or “estuaries that are rivers,” the combination in this case being in ordinary language that of two substantives, instead of that of a substantive and an adjective as in the previous instance.
设有第三个符号,如z,表示“可通航”一词适用的事物类别,那么以下表达式zxy、zyx、xyz等中的任何一个都将表示“作为河口的可通航河流”这一类别。
Let there be a third symbol, as z, representing that class of things to which the term “navigable” is applicable, and any one of the following expressions, zxy, zyx, xyz, &c., will represent the class of “navigable rivers that are estuaries.”
如果其中一个描述性术语对另一个描述性术语有某种隐含的引用,则只需将该引用明确地包含在其规定的含义中,以使上述评论仍然适用。因此,如果x代表“智慧”,y代表“谋士”,我们就必须定义x是指绝对意义上的智慧,还是仅仅指谋士的智慧。根据这样的定义,定律xy = yx仍然有效。
If one of the descriptive terms should have some implied reference to another, it is only necessary to include that reference expressly in its stated meaning, in order to render the above remarks still applicable. Thus, if x represent “wise” and y “counsellor,” we shall have to define whether x implies wisdom in the absolute sense, or only the wisdom of counsel. With such definition the law xy = yx continues to be valid.
因此,我们可以使用符号 x、y、z 等来代替受解释规则约束的实词、形容词和描述性短语,即任何将这些符号写在一起的表达方式应代表其几种含义共同适用的所有物体或个人,并且符号彼此相继的顺序无关紧要的法律。
We are permitted, therefore, to employ the symbols x, y, z, &c., in the place of the substantives, adjectives, and descriptive phrases subject to the rule of interpretation, that any expression in which several of these symbols are written together shall represent all the objects or individuals to which their several meanings are together applicable, and to the law that the order in which the symbols succeed each other is indifferent.
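As a minimal sketch of the rule of interpretation and the law of indifferent order, classes can be modelled as Python sets and the juxtaposition xy as set intersection. The individual names and the helper `combine` are illustrative assumptions, not anything taken from the text.

```python
# Assumption: a class is a Python set, and writing two symbols together (xy)
# means "the things to which both descriptions apply", i.e. set intersection.
white_things = {"sheep_1", "sheep_2", "swan_1", "snowball"}  # hypothetical members
sheep = {"sheep_1", "sheep_2", "sheep_3"}                    # sheep_3 is not white


def combine(x, y):
    """Juxtaposition xy: the individuals common to class x and class y."""
    return x & y


white_sheep = combine(white_things, sheep)   # xy
sheep_white = combine(sheep, white_things)   # yx

print(sorted(white_sheep))          # ['sheep_1', 'sheep_2']
print(white_sheep == sheep_white)   # True: the order of the symbols is indifferent
```

The order in which the two descriptions are applied changes how the class is arrived at, but not which individuals it contains.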
由于解释规则已得到充分例证,我认为在定义用于形容词的符号的解释时没有必要总是表达主语“事物”。当我说,让x代表“好”时,人们会理解,只有当该质量的主语由另一个符号提供时, x才代表“好”,并且单独使用时,它的解释将是“好东西”。
As the rule of interpretation has been sufficiently exemplified, I shall deem it unnecessary always to express the subject “things” in defining the interpretation of a symbol used for an adjective. When I say, let x represent “good,” it will be understood that x only represents “good” when a subject for that quality is supplied by another symbol, and that, used alone, its interpretation will be “good things.”
首先,我要指出的是,这个法则是思想法则,严格来说,并不是事物法则。一个物体的性质或属性的顺序差异,除了所有因果关系问题之外,仅仅是概念上的差异。定律(4.1)表达了一个普遍真理,即同一事物可以用不同的方式来设想,并说明了这种差异的本质;它所做的仅此而已。
First, I would remark, that this law is a law of thought, and not, properly speaking, a law of things. Difference in the order of the qualities or attributes of an object, apart from all questions of causation, is a difference in conception merely. The law (4.1) expresses as a general truth, that the same thing may be conceived in different ways, and states the nature of that difference; and it does no more than this.
其次,作为一种思想法则,它实际上体现在语言法则之中,而语言是思想的产物和工具。虽然散文写作的趋势是趋于统一,但即使在散文中,含义绝对且应用于同一主语的一连串形容词,其先后顺序也是无关紧要的;而诗歌的措辞则把同样合法的自由扩展到实词,并由此获得了其丰富多样性的很大一部分。弥尔顿的语言因这种多样性而特别突出。实词不仅经常出现在修饰它的形容词之前,而且常常被放在这些形容词中间。在对光的祈祷的前几行中,我们遇到如下示例:
Secondly, as a law of thought, it is actually developed in a law of Language, the product and the instrument of thought. Though the tendency of prose writing is toward uniformity, yet even there the order of sequence of adjectives absolute in their meaning, and applied to the same subject, is indifferent, but poetic diction borrows much of its rich diversity from the extension of the same lawful freedom to the substantive also. The language of Milton is peculiarly distinguished by this species of variety. Not only does the substantive often precede the adjectives by which it is qualified, but it is frequently placed in their midst. In the first few lines of the invocation to Light, we meet with such examples as the following:
“天之后裔,头生之子。”
“Offspring of heaven first-born.”
“不断上升的水域世界黑暗而深邃。”
“The rising world of waters dark and deep.”
“非受造的光明本质所发出的光明流溢。”
“Bright effluence of bright essence increate.”
现在,这些倒置的形式不仅仅是诗意许可的成果。它们是由思想的密切法则所认可的自由的自然表达,但出于方便的原因,并未在日常语言使用中行使。
Now these inverted forms are not simply the fruits of a poetic license. They are the natural expressions of a freedom sanctioned by the intimate laws of thought, but for reasons of convenience not exercised in the ordinary use of language.
第三,( 4.1 )所表达的定律的特征可以是文字符号x、y、z是可交换的,就像代数的符号一样。这么说,并没有肯定代数中的乘法过程(其基本定律由方程xy = yx表示)本身与上面用xy表示的逻辑组合过程具有任何类比;但前提是,如果算术过程和逻辑过程以相同的方式表达,那么它们的符号表达将服从相同的形式法则。在这两个案件中,这种服从的证据是截然不同的。
Thirdly, the law expressed by (4.1) may be characterized by saying that the literal symbols x, y, z, are commutative, like the symbols of Algebra. In saying this, it is not affirmed that the process of multiplication in Algebra, of which the fundamental law is expressed by the equation xy = yx, possesses in itself any analogy with that process of logical combination which xy has been made to represent above; but only that if the arithmetical and the logical process are expressed in the same manner, their symbolical expressions will be subject to the same formal law. The evidence of that subjection is in the two cases quite distinct.
x² = x, (4.2)
事实上,它是那些符号的第二一般法则的表达,通过这些符号来象征性地表示名称、品质或描述。
and is, in fact, the expression of a second general law of those symbols by which names, qualities, or descriptions, are symbolically represented.
读者必须记住,虽然前面形成的例子中的符号x和y具有彼此不同的含义,但没有什么可以阻止我们赋予它们完全相同的含义。显然,它们的实际含义越接近,由组合xy表示的事物类别与x表示的类别以及y表示的类别就越接近。方程(4.2)的证明中假设的情况是意义的绝对同一性。它所表达的法律实际上是用语言来例证的。对任何主题说“好,好”,尽管是一种繁琐且无用的重复,但与说“好”是一样的。因此,“好、好”的人,就等于“好”的人。有时确实会使用这种重复的词语来提高质量或加强肯定。但这种影响只是次要的和常规;它不是建立在语言和思想的内在关系之上的。我们在自然界中观察到的或我们自己进行的大多数操作都是这样一种情况,即它们的效果会通过重复而增强,这种情况使我们准备好在语言中期待同样的事情,甚至在我们打算说话时使用重复强调。但无论是在严格推理还是在精确论述中,这种做法都没有任何正当理由。
The reader must bear in mind that although the symbols x and y in the examples previously formed received significations distinct from each other, nothing prevents us from attributing to them precisely the same signification. It is evident that the more nearly their actual significations approach to each other, the more nearly does the class of things denoted by the combination xy approach to identity with the class denoted by x, as well as with that denoted by y. The case supposed in the demonstration of the equation (4.2) is that of absolute identity of meaning. The law which it expresses is practically exemplified in language. To say “good, good,” in relation to any subject, though a cumbrous and useless pleonasm, is the same as to say “good.” Thus “good, good” men, is equivalent to “good” men. Such repetitions of words are indeed sometimes employed to heighten a quality or strengthen an affirmation. But this effect is merely secondary and conventional; it is not founded in the intrinsic relations of language and thought. Most of the operations which we observe in nature, or perform ourselves, are of such a kind that their effect is augmented by repetition, and this circumstance prepares us to expect the same thing in language, and even to use repetition when we design to speak with emphasis. But neither in strict reasoning nor in exact discourse is there any just ground for such a practice.
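The point that repeating a description adds nothing can be sketched in a few lines. This assumes, as an illustration only, that a class is a Python set and that the combination xy is intersection; the member names are hypothetical.

```python
# Assumption: a class is a Python set and the combination xy is intersection.
good_men = {"Aristides", "Socrates", "Cato"}   # hypothetical members


def combine(x, y):
    """Juxtaposition xy: the individuals common to class x and class y."""
    return x & y


# Logical reading: "good, good men" picks out the same class as "good men".
print(combine(good_men, good_men) == good_men)   # True

# Arithmetical reading: repeating a multiplication does change most numbers.
print(3 * 3 == 3)                                # False
```

The contrast in the last line is the one the text draws: operations in nature and in arithmetic are usually augmented by repetition, while the logical combination is not.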
我们不仅能够接受适用于所考虑的群体中每个个体的以名称、性质或环境为特征的物体概念,而且能够形成由部分群体组成的一组物体的总体概念,每个群体其中单独命名或描述。为此,我们使用连词“and”、“or”等。“树木和矿物”、“荒山或肥沃的山谷”就是这样的例子。严格来说,插入在描述两类或多类对象的术语之间的“和”、“或”一词意味着这些类是完全不同的,因此在另一类中找不到其中一类的成员。在这方面以及在所有其他方面,“和”“或”一词与代数中的符号“+”类似,并且它们的定律是相同的。因此,抛开传统含义,“男人和女人”这个词与“女人和男人”这个词是等价的。设x代表“男性”,y代表“女性”;让 + 代表“与”和“或”,那么我们有x + y = y + x,如果x和y代表数字,并且 + 是算术加法的符号,则该方程同样成立。
We are not only capable of entertaining the conceptions of objects, as characterized by names, qualities, or circumstances, applicable to each individual of the group under consideration, but also of forming the aggregate conception of a group of objects consisting of partial groups, each of which is separately named or described. For this purpose we use the conjunctions “and,” “or,” &c. “Trees and minerals,” “barren mountains, or fertile vales,” are examples of this kind. In strictness, the words “and,” “or,” interposed between the terms descriptive of two or more classes of objects, imply that those classes are quite distinct, so that no member of one is found in another. In this and in all other respects the words “and” “or” are analogous with the sign + in algebra, and their laws are identical. Thus the expression “men and women” is, conventional meanings set aside, equivalent with the expression “women and men.” Let x represent “men,” y, “women”; and let + stand for “and” and “or,” then we have x + y = y + x, an equation which would equally hold true if x and y represented numbers, and + were the sign of arithmetical addition.
让符号z代表形容词“欧洲人”,那么实际上,说“欧洲男人和女人”与说“欧洲男人和欧洲女人”是同一件事,我们有
Let the symbol z stand for the adjective “European,” then since it is, in effect, the same thing to say “European men and women,” as to say “European men and European women,” we have
z(x + y) = zx + zy. (4.3)
如果x、y和z是数字符号,并且是两个文字符号的并置来表示它们的代数积,那么这个方程也同样成立,就像前面给出的逻辑意义一样,它代表了对象的类别这两个绰号连在一起属于。
And this equation also would be equally true were x, y, and z symbols of number, and were the juxtaposition of two literal symbols to represent their algebraic product, just as in the logical signification previously given, it represents the class of objects to which both the epithets conjoined belong.
上述是支配符号+的使用的法则,这里用它来表示将部分聚合成整体的正向操作。但是,产生某种正向变化的操作这一想法本身,似乎就向我们暗示了相反或负向操作的想法,其效果是撤销前一个操作所做的事情。因此,我们不可能设想将部分聚集成整体是可能的,却不同时设想将部分从整体中分离出来也是可能的。这种操作我们在日常语言中用“除了”这一符号来表达,例如“除了亚洲人之外的所有人”、“除了君主制国家之外的所有国家”。这里隐含的是,被排除的事物构成它们从中被排除的事物的一部分。由于我们已经用+号来表示聚合操作,所以我们可以用-(减号)来表示上面描述的负向操作。因此,如果x代表人,y代表亚洲人,即亚洲的人,那么“除亚洲人之外的所有人”这一概念将用x − y表示。如果我们用x表示“国家”,用y表示“具有君主制形式”这一描述性属性,那么“除君主制国家之外的所有国家”的概念将由x − xy表示。
The above are the laws which govern the use of the sign +, here used to denote the positive operation of aggregating parts into a whole. But the very idea of an operation effecting some positive change seems to suggest to us the idea of an opposite or negative operation, having the effect of undoing what the former one has done. Thus we cannot conceive it possible to collect parts into a whole, and not conceive it also possible to separate a part from a whole. This operation we express in common language by the sign except, as, “All men except Asiatics,” “All states except those which are monarchical.” Here it is implied that the things excepted form a part of the things from which they are excepted. As we have expressed the operation of aggregation by the sign +, so we may express the negative operation above described by − minus. Thus if x be taken to represent men, and y, Asiatics, i.e. Asiatic men, then the conception of “All men except Asiatics” will be expressed by x − y. And if we represent by x, “states,” and by y the descriptive property “having a monarchical form,” then the conception of “All states except those which are monarchical” will be expressed by x − xy.
由于对于推理的所有基本目的而言,无论我们按照言语顺序首先还是最后表达例外情况,都无关紧要,因此我们编写任何一系列术语的顺序也无关紧要,其中一些术语受符号 - 的影响。因此,正如在普通代数中一样,我们有:
As it is indifferent for all the essential purposes of reasoning whether we express excepted cases first or last in the order of speech, it is also indifferent in what order we write any series of terms, some of which are affected by the sign −. Thus we have, as in the common algebra,
x − y = −y + x,
仍然用x代表“男人”类,用y代表“亚洲人”,让z代表形容词“白人”。现在,将形容词“白人”应用于短语“除亚洲人之外的男性”所表达的男性集合,就等于说“白人,除了白人亚洲人”。因此我们有
Still representing by x the class “men,” and by y “Asiatics,” let z represent the adjective “white.” Now to apply the adjective “white” to the collection of men expressed by the phrase “Men except Asiatics,” is the same as to say, “White men, except white Asiatics.” Hence we have
z(x − y) = zx − zy. (4.4)
这也符合普通代数的规律。
This is also in accordance with the laws of ordinary algebra.
方程(4.3)和(4.4)可以被认为是单个一般定律的例证,这一定律可以表述为:文字符号x、y、z等在其运算中是可分配的。该定律所表达的一般事实是:如果某种性质或情况被归于一个群体的所有成员,而该群体是通过聚合或排除部分群体而形成的,那么所得到的概念,与先将该性质或情况归于各部分群体的每个成员、然后再进行聚合或排除所得到的概念相同。归于整体成员的东西,也归于其所有部分的成员,无论这些部分是如何连接在一起的。
The equations (4.3) and (4.4) may be considered as exemplification of a single general law, which may be stated by saying, that the literal symbols, x, y, z, &c. are distributive in their operation. The general fact which that law expresses is this, viz.: If any quality or circumstance is ascribed to all the members of a group, formed either by aggregation or exclusion of partial groups, the resulting conception is the same as if the quality or circumstance were first ascribed to each member of the partial groups, and the aggregation or exclusion effected afterwards. That which is ascribed to the members of the whole is ascribed to the members of all its parts, howsoever those parts are connected together.
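The distributive law, in the two forms z(x + y) = zx + zy and z(x − y) = zx − zy, can be checked on a toy example. The sketch below assumes classes are Python sets, that + is only applied to classes with no common member, and that − is only applied when the excepted class is a part of the whole, as the text requires. The individuals and the helper names `combine`, `aggregate`, and `except_` are hypothetical, and the adjective "European" is used for z in both checks.

```python
# Hypothetical individuals; the class contents are illustrative assumptions.
men = {"Pierre", "Hans", "Kenji"}
women = {"Marie", "Greta", "Yuki"}
european = {"Pierre", "Hans", "Marie", "Greta"}
asiatic_men = {"Kenji"}


def combine(x, y):
    """Juxtaposition xy: individuals common to both classes."""
    return x & y


def aggregate(x, y):
    """x + y, defined here only for classes with no common member."""
    if x & y:
        raise ValueError("+ assumes the two classes are quite distinct")
    return x | y


def except_(x, y):
    """x - y, defined here only when y is a part of x."""
    if not y <= x:
        raise ValueError("- assumes the excepted class is part of the whole")
    return x - y


# z(x + y) = zx + zy: "European men and women" = "European men and European women"
lhs_plus = combine(european, aggregate(men, women))
rhs_plus = aggregate(combine(european, men), combine(european, women))

# z(x - y) = zx - zy: the same pattern with an exception instead of an aggregate
lhs_minus = combine(european, except_(men, asiatic_men))
rhs_minus = except_(combine(european, men), combine(european, asiatic_men))

print(lhs_plus == rhs_plus)     # True
print(lhs_minus == rhs_minus)   # True
```

Raising an error when the side conditions fail is a design choice made here to mirror the strict reading of "and", "or", and "except" in the text; a looser model could simply use set union and difference.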
尽管所有动词都可以适当地引用此类,但出于逻辑的目的,将其视为仅包含实词动词is或are就足够了,因为所有其他动词都可以解析为该元素,并且包含其中一个符号因为这些符号用于表达各种性质或情况,因此它们可以用来表达动词主语的主动或被动关系,考虑过去、现在或未来时间。因此,命题“凯撒征服了高卢人”可以被解析为“凯撒是征服高卢人的人”。我认为这一分析的基础如下:——除非我们理解“征服了高卢人”的含义,即“征服了高卢人的人”这个表达方式,否则我们就无法理解所讨论的句子。因此,它确实是该句子的一个要素;另一个元素是“Cæsar”,还需要另一个元素,系词是为了显示这两者的联系。然而,我并不断言,除了上述之外,没有其他方式来思考“凯撒征服了高卢人”这一命题所表达的关系。但这里给出的分析对于所采取的特定观点而言是正确的,并且足以满足逻辑演绎的目的。可能有人会说,希腊语的被动语态和将来分词意味着存在所主张的原则,即:符号is或are可以被视为每个人称动词的一个元素。
Though all verbs may with propriety be referred to this class, it is sufficient for the purposes of Logic to consider it as including only the substantive verb is or are, since every other verb may be resolved into this element, and one of the signs included under Class I. For as those signs are used to express quality or circumstance of every kind, they may be employed to express the active or passive relation of the subject of the verb, considered with reference either to past, to present, or to future time. Thus the Proposition, “Cæsar conquered the Gauls,” may be resolved into “Cæsar is he who conquered the Gauls.” The ground of this analysis I conceive to be the following:—Unless we understand what is meant by having conquered the Gauls, i.e. by the expression “One who conquered the Gauls,” we cannot understand the sentence in question. It is, therefore, truly an element of that sentence; another element is “Cæsar,” and there is yet another required, the copula is to show the connexion of these two. I do not, however, affirm that there is no other mode than the above of contemplating the relation expressed by the proposition, “Cæsar conquered the Gauls”; but only that the analysis here given is a correct one for the particular point of view which has been taken, and that it suffices for the purposes of logical deduction. It may be remarked that the passive and future participles of the Greek language imply the existence of the principle which has been asserted, viz.: that the sign is or are may be regarded as an element of every personal verb.
现在,如果恒星确实是太阳和行星,那么,除了行星之外的恒星都是太阳。这将给出方程x − z = y,因此它必须是 ( 4.5 )的推论。因此,通过改变其符号, z项已从方程的一侧移至另一侧。这符合转置代数规则。
Now if it be true that the stars are the suns and the planets, it will follow that the stars, except the planets, are suns. This would give the equation x − z = y, which must therefore be a deduction from (4.5). Thus a term z has been removed from one side of an equation to the other by changing its sign. This is in accordance with the algebraic rule of transposition.
但我们可以立即肯定一般公理,而不是纠缠于特定的情况:——
But instead of dwelling upon particular cases, we may at once affirm the general axioms:—
第一。如果相等的事物加上相等的事物,则整体相等。
1st. If equal things are added to equal things, the wholes are equal.
第二。如果从相等的事物中取出相等的事物,则余数相等。
2nd. If equal things are taken from equal things, the remainders are equal.
因此,看来我们可以添加或减去方程,并使用上面给出的转置规则,就像普通代数一样。
And it hence appears that we may add or subtract equations, and employ the rule of transposition above given just as in common algebra.
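The rule of transposition can be illustrated with the stars example. This sketch assumes the premise "the stars are the suns and the planets" is written x = y + z, with x for stars, y for suns, and z for planets, and models classes as Python sets; the member names are placeholders.

```python
# Assumed premise: stars = suns + planets, the two parts having no common member.
suns = {"Sol", "Sirius"}          # y (placeholder members)
planets = {"Mars", "Venus"}       # z (placeholder members)
assert not (suns & planets)       # + requires distinct classes
stars = suns | planets            # x = y + z

# Transposition: moving z to the other side changes its sign, giving x - z = y.
print(stars - planets == suns)    # True

# Second axiom: equals taken from equals leave equal remainders.
left, right = stars, suns | planets   # two expressions for the same class
print(left - planets == right - planets)   # True
```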
再次:如果两类事物x和y是相同的,也就是说,如果一类的所有成员都是另一类的成员,那么拥有给定属性z的一类的那些成员将与那些成员相同具有相同属性z的另一个。因此,如果我们有方程x = y;那么无论z代表什么类或属性,我们也有zx = zy。
Again: If two classes of things, x and y, be identical, that is, if all the members of the one are members of the other, then those members of the one class which possess a given property z will be identical with those members of the other which possess the same property z. Hence if we have the equation x = y; then whatever class or property z may represent, we have also zx = zy.
这在形式上与代数定律相同:如果方程的两个成员都乘以相同的量,则乘积相等。
This is formally the same as the algebraic law:—If both members of an equation are multiplied by the same quantity, the products are equal.
以类似的方式可以证明,如果两个方程的相应成员相乘,则所得方程为真。
In like manner it may be shown that if the corresponding members of two equations are multiplied together, the resulting equation is true.
但稍微考虑一下就会发现,即使在普通代数中,该公理也不具备已考虑的其他公理的一般性。从方程zx = zy推导出方程x = y仅当已知z不等于 0 时才有效。如果此时值z = 0 在代数系统中被认为是可接受的,则上述公理不再适用,并且之前举例的类比至少仍然没有被打破。
But a little consideration will show that even in common algebra that axiom does not possess the generality of those other axioms which have been considered. The deduction of the equation x = y from the equation zx = zy is only valid when it is known that z is not equal to 0. If then the value z = 0 is supposed to be admissible in the algebraic system, the axiom above stated ceases to be applicable, and the analogy before exemplified remains at least unbroken.
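A small counterexample may make the limitation concrete. Assuming classes are Python sets and zx is intersection, two different classes x and y can agree on the part that falls under z; the same failure appears in arithmetic when z = 0. All values are illustrative.

```python
# Logical case: zx = zy holds, yet x and y are different classes.
x = {"a", "b"}
y = {"a", "c"}
z = {"a"}
print((z & x) == (z & y))   # True
print(x == y)               # False

# Arithmetical case: the same inference fails when z = 0.
z_num, x_num, y_num = 0, 2, 5
print(z_num * x_num == z_num * y_num)   # True
print(x_num == y_num)                   # False
```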
实词、形容词和动词,以及助词“和”、“除了”,我们已经考虑过。代词可以被视为实词或形容词的一种特殊形式。副词修饰动词的含义,但不影响其性质。介词有助于表达情况或关系,因此往往会使文字符号的含义更加精确和详细。连词“如果”、“要么”、“或者”主要用于表达命题之间的关系,下文将证明,同样的关系可以完全用基本符号来表达,这些基本符号在解释上与本章已说明其用途和含义的符号相类似,在形式和规律上则与之相同。至于任何剩余的言语成分,经过考察,我们会发现它们要么被用来赋予话语中的词项更明确的意义,从而进入对已经考虑过的文字符号的解释;要么被用来表达伴随命题陈述而来的某种情感或感觉状态,因此不属于理解力的范围,而我们目前所关心的只有理解力。使用的经验将证明所采用的分类是充分的。
The substantive, the adjective, and the verb, together with the particles and, except, we have already considered. The pronoun may be regarded as a particular form of the substantive or the adjective. The adverb modifies the meaning of the verb, but does not affect its nature. Prepositions contribute to the expression of circumstance or relation, and thus tend to give precision and detail to the meaning of the literal symbols. The conjunctions if, either, or, are used chiefly in the expression of relation among propositions, and it will hereafter be shown that the same relations can be completely expressed by elementary symbols analogous in interpretation, and identical in form and law with the symbols whose use and meaning have been explained in this chapter. As to any remaining elements of speech, it will, upon examination, be found that they are used either to give a more definite significance to the terms of discourse, and thus enter into the interpretation of the literal symbols already considered, or to express some emotion or state of feeling accompanying the utterance of a proposition, and thus do not belong to the province of the understanding, with which alone our present concern lies. Experience of its use will testify to the sufficiency of the classification which has been adopted.
这种适当性的基础不能存在于任何解释共同体中。因为在像逻辑和算术这样真正不同的思想体系中(我使用后一个术语作为数字科学的最广泛含义),正确地说,不存在主体共同体。他们中的一个熟悉事物的概念,另一个只考虑它们的数字关系。但是,由于任何推理系统的形式和方法都直接取决于符号所遵循的法则,并且仅通过上述联系间接地取决于它们的解释,因此使用相同的符号可能既合适又有利。不同思想体系中的符号,只要可以对它们进行解释,使它们的形式法则相同,并且使用一致。那么,这种雇佣的基础将不再是解释的共同体,而是正式法律的共同体,它们在各自的体系中受到约束。除了仔细观察和比较那些被视为独立于所考虑的系统的解释之外的结果之外,也不能建立在任何其他基础上的正式法律共同体。
The ground of this propriety cannot consist in any community of interpretation. For in systems of thought so truly distinct as those of Logic and Arithmetic (I use the latter term in its widest sense as the science of Number), there is, properly speaking, no community of subject. The one of them is conversant with the very conceptions of things, the other takes account solely of their numerical relations. But inasmuch as the forms and methods of any system of reasoning depend immediately upon the laws to which the symbols are subject, and only mediately, through the above link of connexion, upon their interpretation, there may be both propriety and advantage in employing the same symbols in different systems of thought, provided that such interpretations can be assigned to them as shall render their formal laws identical, and their use consistent. The ground of that employment will not then be community of interpretation, but the community of the formal laws, to which in their respective systems they are subject. Nor must that community of formal laws be established upon any other ground than that of a careful observation and comparison of those results which are seen to flow independently from the interpretations of the systems under consideration.
这些观察结果将解释以下命题中采用的探究过程。逻辑的文字符号普遍服从以x² = x表达的定律。在数的符号中,只有0和1这两个满足这个定律。但是,这两个符号中的每一个在数值量的系统中还服从其自身特有的定律,这就引出了这样的问题:必须对逻辑的文字符号给出什么样的解释,才能使同样特有的形式定律也在逻辑系统中得以实现。
These observations will explain the process of inquiry adopted in the following Proposition. The literal symbols of Logic are universally subject to the law whose expression is x² = x. Of the symbols of Number there are two only, 0 and 1, which satisfy this law. But each of these symbols is also subject to a law peculiar to itself in the system of numerical magnitude, and this suggests the inquiry, what interpretations must be given to the literal symbols of Logic, in order that the same peculiar and formal laws may be realized in the logical system also.
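The claim that 0 and 1 are the only numbers satisfying x² = x follows from rewriting the equation as x(x − 1) = 0. As a quick empirical check, the sketch below searches a finite range of integers; the range bounds are an arbitrary assumption.

```python
# Search a finite range of integers for solutions of n * n == n.
# The bounds are arbitrary; algebraically n*n - n = n*(n - 1) = 0 forces n in {0, 1}.
solutions = [n for n in range(-1000, 1001) if n * n == n]
print(solutions)   # [0, 1]
```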
0 × y = 0, or 0y = 0, (4.6)
无论y代表什么数字。为了在逻辑系统中遵守这个形式法则,我们必须给符号0赋予这样一种解释:无论y是什么类,0y所代表的类都与0所代表的类相同。稍加考虑就会发现,如果符号0代表“无”,则满足此条件。根据之前的定义,我们可以将“无”称为一个类。事实上,“无”和“宇宙”是类的外延的两个极限,因为它们是通用名称的可能解释的极限:任何通用名称所涉及的个体,既不能少于“无”所包含的个体,也不能多于“宇宙”所包含的个体。
whatever number y may represent. That this formal law may be obeyed in the system of Logic, we must assign to the symbol 0 such an interpretation that the class represented by 0y may be identical with the class represented by 0, whatever the class y may be. A little consideration will show that this condition is satisfied if the symbol 0 represent Nothing. In accordance with a previous definition, we may term Nothing a class. In fact, Nothing and Universe are the two limits of class extension, for they are the limits of the possible interpretations of general names, none of which can relate to fewer individuals than are comprised in Nothing, or to more than are comprised in the Universe.
现在,无论类y是什么,它与“无”类所共有的个体,同“无”类所包含的个体是相同的,因为它们都不存在。因此,通过给0赋予“无”的解释,定律(4.6)得到满足;而且,若要与类y完全一般的特征保持一致,它无法以其他方式得到满足。
Now whatever the class y may be, the individuals which are common to it and to the class “Nothing” are identical with those comprised in the class “Nothing,” for they are none. And thus by assigning to 0 the interpretation Nothing, the law (4.6) is satisfied; and it is not otherwise satisfied consistently with the perfectly general character of the class y.
其次,符号1在数系中满足以下定律,即:
Secondly, the symbol 1 satisfies in the system of Number the following law, viz.,
1 × y = y, or 1y = y,
无论y代表什么数字。假设这个形式方程在本工作的系统中同样有效,其中 1 和y代表类,看来符号 1 必须代表这样一个类,即在任何提议的类y中找到的所有个体也都是类y和 1 所代表的类所共有的个体 1 y。这里稍微考虑一下就会表明,1 所代表的类一定是“宇宙”,因为这是唯一一个包含所有个体的类存在于任何类中。因此,逻辑系统中符号0和1的各自解释是“无”和“宇宙”。
whatever number y may represent. And this formal equation being assumed as equally valid in the system of this work, in which 1 and y represent classes, it appears that the symbol 1 must represent such a class that all the individuals which are found in any proposed class y are also all the individuals 1y that are common to that class y and the class represented by 1. A little consideration will here show that the class represented by 1 must be “the Universe,” since this is the only class in which are found all the individuals that exist in any class. Hence the respective interpretations of the symbols 0 and 1 in the system of Logic are Nothing and Universe.
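The two interpretations can be tested exhaustively on a small model. The sketch assumes a three-element Universe, classes as frozensets, and the product as intersection; it then asks which classes behave like 0 (so that 0y = 0 for every y) and which behave like 1 (so that 1y = y for every y). The helper `all_classes` is a hypothetical name.

```python
from itertools import combinations

universe = frozenset({"a", "b", "c"})   # a toy Universe (assumption)
nothing = frozenset()                   # the empty class


def all_classes(u):
    """Every class (subset) that can be formed from the individuals in u."""
    items = sorted(u)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


classes = all_classes(universe)

# Which classes c satisfy c*y == c for every class y (the law of 0)?
zero_candidates = [c for c in classes if all(c & y == c for y in classes)]
# Which classes c satisfy c*y == y for every class y (the law of 1)?
one_candidates = [c for c in classes if all(c & y == y for y in classes)]

print(zero_candidates == [nothing])    # True: only Nothing obeys 0y = 0
print(one_candidates == [universe])    # True: only the Universe obeys 1y = y
```

In this small model the two laws single out exactly one class each, which agrees with the remark that the laws are not otherwise satisfied when y is perfectly general.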
如果 x 代表任何类别的对象,则1 − x 代表相反或补充的对象类别,即包括类 x 中未包含的所有对象的类别。
If x represent any class of objects, then will 1 − x represent the contrary or supplementary class of objects, i.e. the class including all objects which are not comprehended in the class x.
为了使概念更加清晰,让x代表人类类别,并根据最后一个命题,让我们用 1 来表达宇宙;现在,如果从由“人”和“非人”组成的宇宙概念中,我们排除“人”的概念,那么由此产生的概念就是相反类别的概念,即“非人”。因此,“非人”类将由 1 − x表示。而且,一般来说,无论什么类别的对象由符号x表示,相反的类别都将由 1 − x表示。
For greater distinctness of conception let x represent the class men, and let us express, according to the last Proposition, the Universe by 1; now if from the conception of the Universe, as consisting of “men” and “not-men,” we exclude the conception of “men,” the resulting conception is that of the contrary class, “not-men.” Hence the class “not-men” will be represented by 1 − x. And, in general, whatever class of objects is represented by the symbol x, the contrary class will be expressed by 1 − x.
形而上学家的那条被称为矛盾原理的公理,断言任何存在物都不可能既拥有某种性质,又同时不拥有这种性质;它是基本思想法则的结果,该法则的表达式是x² = x。
That axiom of metaphysicians which is termed the principle of contradiction, and which affirms that it is impossible for any being to possess a quality, and at the same time not to possess it, is a consequence of the fundamental law of thought, whose expression is x² = x.
让我们把这个方程写成x − x² = 0的形式,由此我们有
Let us write this equation in the form x − x² = 0, whence we have
x(1 − x) = 0; (4.7)
这两种变换都由组合和转置的公理定律证明是合理的(第4.2.13节)。为了简单起见,让我们对符号x给予人类的特殊解释,那么 1 − x将代表“非人类”类别(第4.3.14节)。现在,两个类别的表达式的形式乘积代表了它们共同的个体类别(第4.2.9节)。因此,x (1 − x ) 将代表其成员同时是“男人”和“非男人”的阶级,并且等式(4.7)因此表达了这样的原则:一个其成员同时是男人和非男人的阶级男人不存在。换句话说,同一个人不可能同时是人又不是人。现在让符号x的含义从代表“人”扩展到代表具有任何品质的任何类别的存在;因此,等式(4.7)将表达一个存在体不可能同时拥有某种品质和不拥有该品质。但这同样是亚里士多德所描述的所有哲学基本公理的“矛盾原理”。“同一个品质不可能既属于又不属于同一事物。……这是所有原则中最确定的。……因此,示威者将此视为最终意见。因为它本质上是所有其他公理的来源。”
both these transformations being justified by the axiomatic laws of combination and transposition (§4.2.13). Let us, for simplicity of conception, give to the symbol x the particular interpretation of men, then 1 − x will represent the class of “not-men” (§4.3.14). Now the formal product of the expressions of two classes represents that class of individuals which is common to them both (§4.2.9). Hence x(1 − x) will represent the class whose members are at once “men,” and “not men,” and the equation (4.7) thus expresses the principle, that a class whose members are at the same time men and not men does not exist. In other words, that it is impossible for the same individual to be at the same time a man and not a man. Now let the meaning of the symbol x be extended from the representing of “men,” to that of any class of beings characterized by the possession of any quality whatever; and the equation (4.7) will then express that it is impossible for a being to possess a quality and not to possess that quality at the same time. But this is identically that “principle of contradiction” which Aristotle has described as the fundamental axiom of all philosophy. “It is impossible that the same quality should both belong and not belong to the same thing. … This is the most certain of all principles. … Wherefore they who demonstrate refer to this as an ultimate opinion. For it is by nature the source of all the other axioms.”
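The step from x² = x to x(1 − x) = 0, and its reading as the principle of contradiction, can be sketched in two ways: with classes as sets, where 1 − x is the complement within an assumed Universe, and with the numerical values 0 and 1. The Universe, the class `men`, and the helper `complement` are illustrative assumptions.

```python
universe = frozenset(range(6))     # a toy Universe of six individuals (assumption)
men = frozenset({0, 1, 2})         # x: a hypothetical class within it


def complement(x):
    """1 - x: everything in the Universe not comprehended in x."""
    return universe - x


not_men = complement(men)

# x(1 - x) = 0: no individual is at once a man and a not-man.
print(men & not_men == frozenset())        # True
# x + (1 - x) = 1: men and not-men together make up the Universe.
print(men | not_men == universe)           # True

# The same law for the only two numerical values that satisfy x*x == x.
print([x * (1 - x) for x in (0, 1)])       # [0, 0]
```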
引入上述解释并不是因为它在当前体系中的直接价值,而是为了说明智力哲学中的一个重要事实,即通常被视为形而上学基本公理的东西,不过是一条具有数学形式的思想法则的结果。我还希望人们注意这样一个情况:表达这条基本思想法则的方程(4.7)是一个二次方程。……
The above interpretation has been introduced not on account of its immediate value in the present system, but as an illustration of a significant fact in the philosophy of the intellectual powers, viz., that what has been commonly regarded as the fundamental axiom of metaphysics is but the consequence of a law of thought, mathematical in its form. I desire to direct attention also to the circumstance that the equation (4.7) in which that fundamental law of thought is expressed is an equation of the second degree. …
所有逻辑命题都可以被认为属于两个大类中的一个或另一个,它们可以分别被赋予“主要”或“具体命题”以及“次要”或“抽象命题”的名称。
All logical propositions may be considered as belonging to one or the other of two great classes, to which the respective names of “Primary” or “Concrete Propositions,” and “Secondary” or “Abstract Propositions,” may be given.
我们所做的每一个断言都可以参考以下两种类型中的一种或另一种。它要么表达事物之间的关系,要么表达或等同于表达,命题之间的关系。正确地说,关于事物的属性、它们所表现的现象或它们所处的环境的断言,就是对事物之间关系的断言。从逻辑的角度来说,说“雪是白色的”就相当于说“雪是白色的东西”。出于相同的目的,关于事实或事件及其相互联系和依赖性的断言通常等同于这样的断言:关于这些事件的这样或那样的命题在它们的相互真或假方面彼此之间具有一定的关系。前一类与事物有关的命题,我称之为“初级”;后一类与命题有关,我称之为“次要”。这种区别在实践中几乎与命题作为绝对命题或假设命题的常见逻辑区别一样广泛,但并不完全一致。
Every assertion that we make may be referred to one or the other of the two following kinds. Either it expresses a relation among things, or it expresses, or is equivalent to the expression of, a relation among propositions. An assertion respecting the properties of things, or the phænomena which they manifest, or the circumstances in which they are placed, is, properly speaking, the assertion of a relation among things. To say that “snow is white,” is for the ends of logic equivalent to saying, that “snow is a white thing.” An assertion respecting facts or events, their mutual connexion and dependence, is, for the same ends, generally equivalent to the assertion, that such and such propositions concerning those events have a certain relation to each other as respects their mutual truth or falsehood. The former class of propositions, relating to things, I call “Primary”; the latter class, relating to propositions, I call “Secondary.” The distinction is in practice nearly but not quite co-extensive with the common logical distinction of propositions as categorical or hypothetical.
例如,“太阳照耀”、“地球变暖”等命题是主要的;“如果阳光普照,地球就会变暖”这个命题是次要的。说“太阳发光”就是说“太阳就是发光的东西”,它表达了两类事物之间的关系,即“太阳”和“发光的事物”。然而,上面给出的次要命题表达了两个主要命题“太阳照耀”和“地球变暖”之间的依赖关系。我并不特此断言这些命题之间的关系就像它们所表达的事实之间存在的关系一样,是一种因果关系,而只是说命题之间的关系如此暗示着命题之间的关系,并且被命题之间的关系如此暗示着。事实,为了逻辑的目的,它可以被用作该关系的合适代表。
For instance, the propositions, “The sun shines,” “The earth is warmed,” are primary; the proposition, “If the sun shines the earth is warmed,” is secondary. To say, “The sun shines,” is to say, “The sun is that which shines,” and it expresses a relation between two classes of things, viz., “the sun” and “things which shine.” The secondary proposition, however, given above, expresses a relation of dependence between the two primary propositions, “The sun shines,” and “The earth is warmed.” I do not hereby affirm that the relation between these propositions is, like that which exists between the facts which they express, a relation of causality, but only that the relation among the propositions so implies, and is so implied by, the relation among the facts, that it may for the ends of logic be used as a fit representative of that relation.
通常会发生这样的情况:助词“如果”、“要么”、“或者”表明一个命题是次要命题;但它们并不必然意味着情况如此。“动物要么是理性的,要么是非理性的”这一命题是主要命题。它不能被解析为“要么动物是理性的,要么动物是非理性的”,因此它并不表达后一个选言句子中连接在一起的两个命题之间的依赖关系。助词“要么”、“或者”事实上并不是判断命题性质的标准,尽管它们更频繁地出现在次要命题中。甚至连词“如果”也可以在主要命题中找到。“人若有智慧,则有节制”就是这样的一个例子。它不能被解析为“如果所有人都是明智的,那么所有人都是节制的”。
It will usually happen, that the particles if, either, or, will indicate that a proposition is secondary; but they do not necessarily imply that such is the case. The proposition, “Animals are either rational or irrational,” is primary. It cannot be resolved into “Either animals are rational or animals are irrational,” and it does not therefore express a relation of dependence between the two propositions connected together in the latter disjunctive sentence. The particles, either, or, are in fact no criterion of the nature of propositions, although it happens that they are more frequently found in secondary propositions. Even the conjunction if may be found in primary propositions. “Men are, if wise, then temperate,” is an example of the kind. It cannot be resolved into “If all men are wise, then all men are temperate.”
首先,如果要表达的事物的类别或集合仅由其组成的所有个体共有的名称或品质来定义,则其表达将由单个术语组成,其中表达这些名称或品质的符号将是没有任何连接符号的组合,就像乘法的代数过程一样。因此,如果x代表不透明物质,y代表抛光物质,z代表石头,我们将有,
First, If the class or collection of things to be expressed is defined only by names or qualities common to all the individuals of which it consists, its expression will consist of a single term, in which the symbols expressive of those names or qualities will be combined without any connecting sign, as if by the algebraic process of multiplication. Thus, if x represent opaque substances, y polished substances, z stones, we shall have,
xyz = opaque polished stones;
xy(1 − z) = opaque polished substances which are not stones;
x(1 − y)(1 − z) = opaque substances which are not polished, and are not stones;
对于任何其他组合,依此类推。值得注意的是,这些表达式中的每一个都满足与它所包含的各个符号相同的二元性定律。因此,
and so on for any other combination. Let it be observed, that each of these expressions satisfies the same law of duality, as the individual symbols which it contains. Thus,
等等。任何上述术语我们都将指定为“类术语”,因为它通过该类中各个成员的共同属性或名称来表达一类事物。
and so on. Any such term as the above we shall designate as a “class term,” because it expresses a class of things by means of the common properties or names of the individual members of such class.
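One way to make class terms concrete is to model each class as a finite set, read juxtaposition ("multiplication") as intersection, and read 1 − x as the complement within a universe. The sketch below does this for the symbols x (opaque), y (polished), z (stones). The universe and its five items are invented for illustration, and the particular terms computed are sample combinations chosen here, not necessarily the ones printed in the original.

```python
# Hypothetical toy universe; the items and their properties are assumptions.
UNIVERSE = {"granite", "marble slab", "steel plate", "window glass", "oak plank"}
opaque = {"granite", "marble slab", "steel plate", "oak plank"}   # x
polished = {"marble slab", "steel plate", "window glass"}          # y
stones = {"granite", "marble slab"}                                # z


def complement(cls):
    """Model Boole's 1 - x: everything in the universe that is not in cls."""
    return UNIVERSE - cls


# Juxtaposition of symbols is modeled as set intersection.
xyz = opaque & polished & stones                                    # x y z
xy_not_z = opaque & polished & complement(stones)                   # x y (1 - z)
x_not_y_not_z = opaque & complement(polished) & complement(stones)  # x (1 - y)(1 - z)

for name, term in [("xyz", xyz), ("xy(1-z)", xy_not_z), ("x(1-y)(1-z)", x_not_y_not_z)]:
    # Law of duality: a class term "multiplied" by itself is unchanged.
    assert term & term == term
    print(name, "=", sorted(term))
```

Under this reading the law of duality is simply the fact that intersecting a set with itself changes nothing, which is why every class term inherits it from the individual symbols.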
其次,如果我们谈论一个事物的集合,其中的不同部分由不同的属性、名称或属性定义,则这些不同部分的表达式必须单独形成,然后通过符号+连接。但是,如果我们想要谈论的集合是通过从某个更广泛的集合中排除其成员的定义部分而形成的,则符号 - 必须在被排除部分的符号表达之前加上前缀。尊重这些符号的使用,可以添加一些进一步的观察。
Secondly, If we speak of a collection of things, different portions of which are defined by different properties, names, or attributes, the expressions for those different portions must be separately formed, and then connected by the sign +. But if the collection of which we desire to speak has been formed by excluding from some wider collection a defined portion of its members, the sign − must be prefixed to the symbolical expression of the excluded portion. Respecting the use of these symbols some further observations may be added.
因此,根据隐含的含义,“ x或y的事物”这一表达方式将具有两个不同的符号等价物。如果我们的意思是“是x的东西,但不是y的东西,或者是 y的东西,但不是x的东西”,则表达式将是x (1 − y ) + y (1 − x );符号x代表x,y代表y。然而,如果我们的意思是“要么是x的事物,要么是y的事物”,则表达式将是x + y (1 − x )。该表达式假定同时为x和y的事物是可接受的。它可能更完整地表达为xy + x (1 − y ) + y (1 − x )的形式;但这个表达式在添加前两个项后,仅再现了前一个项。
And thus, according to the meaning implied, the expression, “Things which are either x’s or y’s,” will have two different symbolical equivalents. If we mean, “Things which are x’s, but not y’s, or y’s, but not x’s,” the expression will be x(1 − y) + y(1 − x); the symbol x standing for x’s, y for y’s. If, however, we mean, “Things which are either x’s, or, if not x’s, then y’s,” the expression will be x + y(1 − x). This expression supposes the admissibility of things which are both x’s and y’s at the same time. It might more fully be expressed in the form xy + x(1 − y) + y(1 − x); but this expression, on addition of the two first terms, only reproduces the former one.
值得注意的是,上面给出的表达式满足对偶基本定律。因此我们有
Let it be observed that the expressions above given satisfy the fundamental law of duality. Thus we have
下文中将会看到,这只是代表“事物的类或集合”的表达一般法则的一种特殊表现。[编辑:原来的第二个方程有 {x + (1 − x)}² = x + y(1 − x),显然是错误的。]
It will be seen hereafter, that this is but a particular manifestation of a general law of expressions representing “classes or collections of things.” [EDITOR: Original has {x + (1 − x)}² = x + y(1 − x) for the second equation, apparently in error.]
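The two readings of "either x's or y's" can be checked numerically if the symbols are allowed to take only the values 0 and 1, with ordinary integer arithmetic. This is a minimal sketch under that assumption; the function names are made up for illustration. It confirms that both expressions satisfy the law of duality (squaring leaves them unchanged) and that the fuller form xy + x(1 − y) + y(1 − x) agrees with x + y(1 − x).

```python
from itertools import product


def exclusive_or(x, y):
    """x's but not y's, or y's but not x's: x(1 - y) + y(1 - x)."""
    return x * (1 - y) + y * (1 - x)


def inclusive_or(x, y):
    """x's, or, if not x's, then y's: x + y(1 - x)."""
    return x + y * (1 - x)


def fuller_form(x, y):
    """The longer expression xy + x(1 - y) + y(1 - x)."""
    return x * y + x * (1 - y) + y * (1 - x)


for x, y in product((0, 1), repeat=2):
    e, i = exclusive_or(x, y), inclusive_or(x, y)
    assert e * e == e              # law of duality for the exclusive form
    assert i * i == i              # law of duality for the inclusive form
    assert fuller_form(x, y) == i  # the fuller form reproduces x + y(1 - x)
    print(f"x={x} y={y}  exclusive={e}  inclusive={i}")
```

The only row on which the two readings differ is x = y = 1, where the exclusive form gives 0 and the inclusive form gives 1.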
规则——用符号 x、y 、 z等表达简单的名称或性质,用1 − x、 1 − y、 1 − z 等表示它们的反义词;由通用名称或性质定义的事物类别,通过连接相应的符号(如乘法);由彼此不同的部分组成的事物的集合,通过符号+连接这些部分的表达式。特别地,让表达式“要么 x 或 y”用 x (1 − y ) + y (1 − x )表示,当 x 和 y 表示的类互斥时,用 x + y (1 − x)当它们不排他时。类似地,让表达式“要么 x's,要么 y's,要么 z's”由 x (1 − y )(1 − z ) + y (1 − x )(1 − z ) + z (1 − x )( 1 − y ) ,当由 x 、 y 和 z表示的类被设计为互斥的,即 x + y (1 − x ) + z (1 − x )(1 − y ) ,而它们并不意味着互斥,所以在。
RULE.—Express simple names or qualities by the symbols x, y, z, &c., their contraries by 1 − x, 1 − y, 1 − z, &c.; classes of things defined by common names or qualities, by connecting the corresponding symbols as in multiplication; collections of things, consisting of portions different from each other, by connecting the expressions of those portions by the sign +. In particular, let the expression, “Either x’s or y’s,” be expressed by x(1 − y) + y(1 − x), when the classes denoted by x and y are exclusive, by x + y(1 − x) when they are not exclusive. Similarly let the expression, “Either x’s, or y’s, or z’s,” be expressed by x(1 − y)(1 − z) + y(1 − x)(1 − z) + z(1 − x)(1 − y), when the classes denoted by x, y, and z, are designed to be mutually exclusive, by x + y(1 − x) + z(1 − x)(1 − y), when they are not meant to be exclusive, and so on.
“非弹性金属”用z (1 − y )表示;
“Non-elastic metals,” will be expressed by z(1 − y);
“具有非弹性金属的弹性物质”,由y + z (1 − y ) 得出;
“Elastic substances with non-elastic metals,” by y + z(1 − y);
“硬质物质,金属除外”,x − z;
“Hard substances, except metals,” by x − z;
“金属物质,除了那些既不坚硬又没有弹性的物质”
“Metallic substances, except those which are neither hard nor elastic,” by
z − z (1 − x )(1 − y ),或z {1 − (1 − x )(1 − y )}。
z − z(1 − x)(1 − y), or by z{1 − (1 − x)(1 − y)}.
在最后一个例子中,我们真正要表达的是“金属,除了不硬、不弹性的金属”。形容词之间使用的连词通常是多余的,因此不能用象征性的方式表达。
In the last example, what we had really to express was “Metals, except not hard, not elastic, metals.” Conjunctions used between adjectives are usually superfluous, and, therefore, must not be expressed symbolically.
因此,“坚硬且有弹性的金属”相当于“硬弹性金属”,并用xyz表示。
Thus, “Metals hard and elastic,” is equivalent to “Hard elastic metals,” and expressed by xyz.
接下来的表述是“硬质物质,除了金属和非弹性的物质,以及弹性和非金属的物质”。这里“那些”一词表示硬质物质,因此该表达的真正含义是,除金属、非弹性硬质物质之外的硬质物质,以及非金属、弹性硬质物质;这个词除了扩展到它后面的两个类之外。完整的表达式为x −{ xz (1 − y ) + xy (1 − z )}; 或者,x − xz (1 − y ) − xy (1 − z )。
Take next the expression, “Hard substances, except those which are metallic and non-elastic, and those which are elastic and non-metallic.” Here the word those means hard substances, so that the expression really means, Hard substances except hard substances, metallic, non-elastic, and hard substances non-metallic, elastic; the word except extending to both the classes which follow it. The complete expression is x −{xz(1 − y) + xy(1 − z)}; or, x − xz(1 − y) − xy(1 − z).
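The equivalent forms given in these examples can be verified by brute force over 0/1 values. The sketch below assumes, as the examples themselves suggest, that x stands for hard, y for elastic, and z for metals; the function names are hypothetical.

```python
from itertools import product


def metals_minus_form(x, y, z):
    """Metals, except those neither hard nor elastic: z - z(1 - x)(1 - y)."""
    return z - z * (1 - x) * (1 - y)


def metals_factored_form(x, y, z):
    """The same class written as z{1 - (1 - x)(1 - y)}."""
    return z * (1 - (1 - x) * (1 - y))


def hard_braced_form(x, y, z):
    """x - {xz(1 - y) + xy(1 - z)}."""
    return x - (x * z * (1 - y) + x * y * (1 - z))


def hard_expanded_form(x, y, z):
    """x - xz(1 - y) - xy(1 - z)."""
    return x - x * z * (1 - y) - x * y * (1 - z)


for x, y, z in product((0, 1), repeat=3):
    assert metals_minus_form(x, y, z) == metals_factored_form(x, y, z)
    assert hard_braced_form(x, y, z) == hard_expanded_form(x, y, z)
    # Each expression still denotes a class: its value is 0 or 1.
    assert metals_minus_form(x, y, z) in (0, 1)
    assert hard_braced_form(x, y, z) in (0, 1)
```

In both cases the subtracted part is contained in the class it is subtracted from, so the result never leaves the values 0 and 1.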
该定义假设,可以以假定的形式表示任何函数f ( x )。该假设在以下命题中得到证实。
This definition assumes, that it is possible to represent any function f(x) in the form supposed. The assumption is vindicated in the following Proposition.
假设f ( x ) = ax + b (1 − x ),并且使x = 1,我们有f (1) = a。同样,在x = 0的同一个方程中,我们有f (0) = b。因此a和b的值被确定,并且将它们代入第一个方程,我们有 f(x) = f(1)x + f(0)(1 − x),
Assume then, f(x) = ax + b(1 − x), and making x = 1, we have f(1) = a. Again, in the same equation making x = 0, we have f(0) = b. Hence the values of a and b are determined, and substituting them in the first equation, we have f(x) = f(1)x + f(0)(1 − x),
正如所追求的发展。方程的第二个成员充分表示函数f ( x ),无论该函数的形式是什么。……
as the development sought. The second member of the equation adequately represents the function f(x), whatever the form of that function may be. …
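The development can be tried out directly: with a = f(1) and b = f(0), the expression ax + b(1 − x) should agree with f wherever x is 0 or 1, which are the only values a logical symbol takes. This is a minimal sketch; `develop` and the sample functions are names and choices made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def develop(f):
    """Return the expansion x -> f(1)*x + f(0)*(1 - x) of a one-variable function."""
    a, b = f(1), f(0)
    return lambda x: a * x + b * (1 - x)


# Hypothetical sample functions; any function defined at 0 and 1 would do.
samples = {
    "1 - x": lambda x: 1 - x,
    "x**2": lambda x: x ** 2,
    "(1 + x)/(1 + 2x)": lambda x: Fraction(1 + x, 1 + 2 * x),
}

for name, f in samples.items():
    expansion = develop(f)
    for x in (0, 1):
        # Agreement is claimed only at the logical values 0 and 1.
        assert expansion(x) == f(x)
    print(name, "-> coefficients", f(1), "and", f(0))
```

Outside the values 0 and 1 the expansion and the original function generally differ, so the sketch illustrates the claim only in the logical setting.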
重印自布尔 (1854)。
Reprinted from Boole (1854).
对于科学的进步来说,没有什么比知道要问什么问题更重要的了。哥廷根大学伟大的德国数学家戴维·希尔伯特(David Hilbert,1862-1943)利用二十世纪之交国际数学家大会的契机概括地描述了数学问题及其解决方案,并用一系列数学问题向他的同事提出了挑战。 23个未解决的问题。几个问题很快就得到了解决,但第十个问题花了七十年才解决,而且解决方案以希尔伯特在 1900 年无法阐明的形式——递归不可解性的证明——来解决。第八个问题,黎曼假设,仍然悬而未决。 ; 其他措辞更含糊的问题的地位是有争议的。
Nothing is more important to the progress of science than knowing what question to ask. The great German mathematician David Hilbert (1862–1943) of the University of Göttingen used the occasion of the International Congress of Mathematicians at the turn of the twentieth century to characterize mathematical problems and their solutions generally, and to challenge his colleagues with a list of 23 unsolved problems. Several problems were solved quickly, but the tenth problem took seventy years to resolve, and the resolution came in a form—a demonstration of recursive unsolvability—that Hilbert could not have articulated in 1900. The eighth problem, the Riemann hypothesis, is still unresolved; the status of other more ambiguously worded problems is arguable.
2000 年,克莱数学研究所更新了希尔伯特的名单,列出了七个千年奖问题,并为解决每个问题提供 100 万美元的奖励(克莱数学研究所,2000 年)。迄今为止,只有一个——庞加莱猜想——得到了解决。黎曼猜想被列入千年名单;𝒫 = 𝒩 𝒫问题也是如此(第 34 章)。
In the year 2000, the Clay Mathematics Institute updated Hilbert’s list with a list of seven Millennium Prize problems, offering a $1 million reward for the solution of each (Clay Mathematics Institute, 2000). To date, only one, the Poincaré conjecture, has been solved. The Riemann hypothesis is on the Millennium list; so is the 𝒫 = 𝒩𝒫 problem (chapter 34).
1900 年,人们还不知道算法的精确概念,更不用说递归不可解性了。然而希尔伯特的演讲充满了未来的暗示:他提到了“有限步骤”的过程,以及“在给定假设下或在所设想的意义上解决方案的不可能性”——对此他给出了作为一个例子,古代证明了 的非理性。十九世纪末,布尔预示的逻辑革命发展到元数学领域,即为数学本身的行为构建坚实的数学基础的程序。戈特洛布·弗雷格(Gottlob Frege,1879)通过出版《Begriffsschrift》提供了现代逻辑的第一个严格的公理化,怀特海德和罗素(Whitehead and Russell,1910)出版了他们的巨著《数学原理》的第一部分,试图将所有数学完全形式化。莱布尼茨的梦想似乎触手可及。
No precise notion of an algorithm was known in 1900, much less of recursive unsolvability. And yet Hilbert’s address is full of hints of things to come: his references to processes with a “finite number of steps,” and to “the impossibility of the solution under the given hypotheses, or in the sense contemplated” (for which he gives as an example the ancient proof of the irrationality of √2). In the late nineteenth century, the logical revolution foreshadowed by Boole developed into the field of metamathematics, the program to construct solid mathematical foundations for the conduct of mathematics itself. Gottlob Frege (1879) provided the first rigorous axiomatization of modern logic with the publication of Begriffsschrift, and Whitehead and Russell (1910) published the first part of their massive Principia Mathematica, an attempt to completely formalize all of mathematics. Leibniz’s dream seemed within reach.
然而,随着数学形式化的尝试变得更加有力,一些不一致和悖论出现了。希尔伯特对其乐观主义的基础不稳固感到不安,他向数学界发起挑战,要求“一劳永逸地解决数学中的基本问题”(Hilbert,1928;Van Heijenoort,1967)。这个挑战被称为希尔伯特计划,只有当挑战提出后,我们才能想象它可能无法实现。
Yet as the attempts to formalize mathematics became more robust, several inconsistencies and paradoxes emerged. Hilbert, troubled by the shaky ground on which his optimism was founded, challenged the mathematical community to “dispose of the foundational questions in mathematics as such once and for all” (Hilbert, 1928; Van Heijenoort, 1967). This challenge became known as Hilbert’s program, and only once the challenge was posed was it possible to imagine that it might not be achievable.
在希尔伯特 1900 年的演讲中,并没有太多思考这种不可能。语气是振奋和鼓励的。每个问题都可以以某种方式解决。不存在无知的情况(字面意思是,不存在“我们不会知道”——也就是说,不放弃寻找答案)。
In Hilbert’s 1900 address, there is not much thinking about such an impossibility. The tone is uplifting and encouraging. Every problem can be solved, one way or another. There is no ignorabimus (literally, no “we shall not know”—that is, no giving up on finding the answer).
事实上,这种精神渗透到了当时的人类文化中。二十世纪之交是世界普遍积极的时刻。战争是有限的且规模可控。工业革命带来了繁荣和进步,至少给西方世界带来了繁荣和进步。五千万人参观了巴黎世界博览会,亲眼目睹现代世界的奇迹。新艺术(“新艺术”)风格的浪漫、旋涡状艺术品表达了人们的普遍感受:生活是温柔的、美丽的、有机繁荣的,而且只会变得更好。
In fact this spirit informed much of human culture at the time. The turn of the twentieth century was a moment of general positivity in the world. Wars were limited and of manageable scale. The Industrial Revolution had brought prosperity and progress, at least to the Western world. Fifty million people visited the Exposition Universelle (World Fair) in Paris to see first-hand the wonders of the modern world. Romantic, swirly artworks in the art nouveau (“new art”) style spoke to the general feelings that life was gentle, beautiful, flourishing organically, and could only get better.
我们摘录希尔伯特演讲的序言。在这 23 个问题中,我们只包括第二个问题和第十个问题。第十个问题是确定丢番图方程是否在整数上可解的问题。这些方程的形式为p ( x 1 , … , x k ) = 0,其中p是多项式,其中项是整数常量乘以变量的整数次方的乘积,例如。x 1 , … , x k的值为整数的要求将看起来像代数问题的问题变成了组合问题。马丁·戴维斯 (Martin Davis)、希拉里·帕特南 (Hilary Putnam)、朱莉娅·罗宾逊 (Julia Robinson) 和尤里·马蒂亚塞维奇 (Yuri Matiyasevich) 长达数十年的努力最终证明了确定此类方程是否有解的一般问题是无法解决的,而马蒂亚塞维奇 (Matiyasevich)在 1970 年贡献了数论致命一击(Matiyasevich) ,1993)。
We excerpt the prefatory remarks of Hilbert’s address. Of the 23 problems, we include only the second and the tenth. The tenth is the problem of determining whether a diophantine equation is solvable over the integers. These are equations of the form p(x1, …, xk) = 0, where p is a polynomial in which the terms are integer constants multiplied by products of the variables raised to integer powers—for example, . The requirement that the values of x1, …, xk be integers turns what looks like an algebraic problem into a combinatorial problem. The decades-long efforts of Martin Davis, Hilary Putnam, Julia Robinson, and Yuri Matiyasevich eventually demonstrated the unsolvability of the general problem of determining whether such an equation has a solution, with Matiyasevich contributing the number-theoretic coup de grâce in 1970 (Matiyasevich, 1993).
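To see why the integer requirement gives the problem a combinatorial flavor, consider the obvious approach of trying integer tuples one after another. The sketch below searches a bounded box of candidates. The sample polynomials and the function name are assumptions chosen for illustration, not examples from the text. A search like this can confirm that a solution exists, but a failure within the bound says nothing in general about solutions outside it, and the unsolvability result means no general procedure can close that gap.

```python
from itertools import product


def find_integer_solution(p, num_vars, bound):
    """Try every integer tuple with entries in [-bound, bound]; return a root of p or None."""
    values = range(-bound, bound + 1)
    for candidate in product(values, repeat=num_vars):
        if p(*candidate) == 0:
            return candidate
    return None


def pell_like(x1, x2):
    """Hypothetical example polynomial: x1^2 - 2*x2^2 - 1."""
    return x1 ** 2 - 2 * x2 ** 2 - 1


def sum_of_squares_plus_one(x1, x2):
    """Hypothetical example with no integer roots at all: x1^2 + x2^2 + 1."""
    return x1 ** 2 + x2 ** 2 + 1


solution = find_integer_solution(pell_like, num_vars=2, bound=3)
print("pell_like root:", solution)

# None here only reports that no root lies in the searched box.
print("sum_of_squares_plus_one root:", find_integer_solution(sum_of_squares_plus_one, 2, 5))
```

For the second polynomial one can see by inspection that no root exists, since the value is always at least 1; the point of the theorem is that no uniform method delivers such a verdict for every polynomial.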
希尔伯特发表讲话十五年后,欧洲正陷入一场血腥、看似毫无意义且无休无止的战争的深渊。文学变得讽刺和愤世嫉俗;视觉艺术中出现了黑暗和暴力的主题。仿佛反映了悲观主义的转变,在两次世界大战之间,哥德尔和图灵展示了象征性地形式化离散步骤的有限过程的想法所带来的意想不到的后果。正式系统本身似乎有其局限性。
Fifteen years after Hilbert’s address, Europe was in the depths of a bloody, seemingly senseless and endless war. Literature turned ironic and cynical; dark and violent themes emerged in the visual arts. As though reflecting the pessimistic shift, between the two World Wars Gödel and Turing demonstrated the unexpected consequences of formalizing symbolically the idea of a finite process of discrete steps. Formal systems themselves have their limits, it seemed.
20 世纪 20 年代,希尔伯特的健康状况每况愈下,而在他周围成长起来的哥廷根数学学院在纳粹清除犹太教授后几乎全部解散。希尔伯特的墓碑上刻着(德语)他的勇敢话语:“我们必须知道。我们会知道的。” 他在 1930 年说过这句话,就在库尔特·哥德尔宣布希尔伯特第二个问题(关于公理的独立性和一致性)的一系列调查结果的前一天。哥德尔证明,在任何强大到足以使基本算术公理化的递归公理化一致逻辑系统中,一些真命题无法被证明,从而混淆了对已知问题的简单答案的任何尝试。
Hilbert’s health declined during the 1920s, and the great school of mathematicians that had grown around him at Göttingen was all but dissolved when the Nazis purged it of Jewish professors. Hilbert’s gravestone bears (in German) his brave words, “We must know. We shall know.” He had said this in 1930, on the day before Kurt Gödel announced a result in a line of inquiry stimulated by Hilbert’s second problem, about the independence and consistency of axioms. Gödel proved that in any recursively axiomatizable consistent logical system strong enough to axiomatize basic arithmetic, some true propositions could not be proved—thus confounding any attempt at a simple answer to the question of what can be known.
然而,哥德尔和图灵在关上一扇门的同时,也打开了无数扇门。哥德尔证明数学永远不可能是一个单一的封闭系统,这也意味着它的可能性是无限的。图灵通过定义什么是算法来证明算法不能做什么,开启了算法研究的理论和实践科学分析,并开创了当今的计算世界。
And yet, in closing one door, Gödel and Turing opened countless others. Gödel’s proof that mathematics can never be a single closed system also meant that its possibilities are limitless. Turing, in defining what an algorithm is in order to prove what algorithms cannot do, opened the study of algorithms to scientific analysis, both theoretical and practical, and ushered in the computational world of today.
我们谁都不愿意揭开隐藏未来的面纱;看看我们科学的下一步进展及其在未来几个世纪发展的秘密?未来几代领先的数学精神将努力实现哪些具体目标?新世纪将在广阔而丰富的数学思想领域揭示哪些新方法和新事实?
WHO of us would not be glad to lift the veil behind which the future lies hidden; to cast a glance at the next advances of our science and at the secrets of its development during future centuries? What particular goals will there be toward which the leading mathematical spirits of coming generations will strive? What new methods and new facts in the wide and rich field of mathematical thought will the new centuries disclose?
历史告诉我们科学发展的连续性。我们知道,每个时代都有自己的问题,下一个时代要么解决了这些问题,要么认为这些问题无利可图而被抛弃并被新的问题所取代。如果我们想了解数学知识在不久的将来可能的发展,我们就必须把悬而未决的问题抛在脑后,并审视当今科学提出的问题以及我们期望未来得到的解决方案。在我看来,在几个世纪的会议上对当今的问题进行这样的回顾是非常适合的。一个伟大时代的结束,不仅让我们回顾过去,也让我们思考未知的未来。
History teaches the continuity of the development of science. We know that every age has its own problems, which the following age either solves or casts aside as profitless and replaces by new ones. If we would obtain an idea of the probable development of mathematical knowledge in the immediate future, we must let the unsettled questions pass before our minds and look over the problems which the science of today sets and whose solution we expect from the future. To such a review of problems the present day, lying at the meeting of the centuries, seems to me well adapted. For the close of a great epoch not only invites us to look back into the past but also directs our thoughts to the unknown future.
某些问题对于数学科学整体进步的深刻意义以及它们在个别研究者的工作中所发挥的重要作用是不容否认的。只要科学的一个分支能够提出大量的问题,它就能存在多久;没有问题就预示着消亡或独立发展的停止。正如人类的每一项事业都追求特定的目标一样,数学研究也需要它的问题。调查者通过解决问题来检验他的钢铁品质。他找到了新的方法、新的观点,获得了更广阔、更自由的视野。
The deep significance of certain problems for the advance of mathematical science in general and the important role which they play in the work of the individual investigator are not to be denied. As long as a branch of science offers an abundance of problems, so long is it alive; a lack of problems foreshadows extinction or the cessation of independent development. Just as every human undertaking pursues certain objects, so also mathematical research requires its problems. It is by the solution of problems that the investigator tests the temper of his steel; he finds new methods and new outlooks, and gains a wider and freer horizon.
提前正确判断问题的价值是很困难的,而且往往是不可能的;因为最终的奖项取决于科学从问题中获得的收益。尽管如此,我们还是可以问是否存在标志着一个好的数学问题的一般标准。一位法国老数学家说过:“一个数学理论只有在你能够向街上遇到的第一个人解释清楚之后才能被认为是完整的。” 这里对于数学理论所坚持的清晰易懂,我更应该要求数学问题如果要完美的话;因为清晰易懂的事物会吸引我们,复杂的事物会令我们排斥。
It is difficult and often impossible to judge the value of a problem correctly in advance; for the final award depends upon the gain which science obtains from the problem. Nevertheless we can ask whether there are general criteria which mark a good mathematical problem. An old French mathematician said: “A mathematical theory is not to be considered complete until you have made it so clear that you can explain it to the first man whom you meet on the street.” This clearness and ease of comprehension, here insisted on for a mathematical theory, I should still more demand for a mathematical problem if it is to be perfect; for what is clear and easily comprehended attracts, the complicated repels us.
此外,数学问题应该很困难才能吸引我们,但又不能完全难以解决,以免它嘲笑我们的努力。对于我们来说,它应该是通往隐藏真相的迷宫之路的指南,并最终提醒我们对成功解决方案的喜悦。过去几个世纪的数学家习惯于以极大的热情致力于解决困难的特定问题。他们知道困难问题的价值。我只提醒你约翰·伯努利提出的“最快下降线问题”。伯努利在公开宣布这个问题时解释说,经验告诉我们,只有通过向他们提出困难但同时又有用的问题,才能引导崇高的思想者为科学的进步而奋斗,因此他希望赢得感谢追随梅森、帕斯卡、费马、维维亚尼等人的榜样,向当时杰出的分析家提出一个问题,作为试金石,他们可以通过这个问题来检验数学世界的发展他们的方法的价值并衡量他们的力量。变分法起源于伯努利问题和类似问题。
Moreover a mathematical problem should be difficult in order to entice us, yet not completely inaccessible, lest it mock at our efforts. It should be to us a guide post on the mazy paths to hidden truths, and ultimately a reminder of our pleasure in the successful solution. The mathematicians of past centuries were accustomed to devote themselves to the solution of difficult particular problems with passionate zeal. They knew the value of difficult problems. I remind you only of the “problem of the line of quickest descent,” proposed by John Bernoulli. Experience teaches, explains Bernoulli in the public announcement of this problem, that lofty minds are led to strive for the advance of science by nothing more than by laying before them difficult and at the same time useful problems, and he therefore hopes to earn the thanks of the mathematical world by following the example of men like Mersenne, Pascal, Fermat, Viviani and others and laying before the distinguished analysts of his time a problem by which, as a touchstone, they may test the value of their methods and measure their strength. The calculus of variations owes its origin to this problem of Bernoulli and to similar problems.
众所周知,费马断言丢番图方程 xⁿ + yⁿ = zⁿ
Fermat had asserted, as is well known, that the diophantine equation xⁿ + yⁿ = zⁿ
(x、y和z整数)是无解的——除非在某些不言而喻的情况下。证明这种不可能性的尝试提供了一个引人注目的例子,说明这样一个非常特殊且显然不重要的问题可能对科学产生鼓舞人心的影响。对于库默来说,受费马问题的启发,引入了理想数,并发现了将圆域的数唯一分解为理想素因子的定律——这一定律今天在推广到任何代数时都得到了应用。戴德金和克罗内克的领域处于现代数论的中心,其意义远远超出了数论的界限,进入了代数和函数论领域。
(x, y and z integers) is unsolvable—except in certain self-evident cases. The attempt to prove this impossibility offers a striking example of the inspiring effect which such a very special and apparently unimportant problem may have upon science. For Kummer, incited by Fermat’s problem, was led to the introduction of ideal numbers and to the discovery of the law of the unique decomposition of the numbers of a circular field into ideal prime factors—a law which today, in its generalization to any algebraic field by Dedekind and Kronecker, stands at the center of the modern theory of numbers and whose significance extends far beyond the boundaries of number theory into the realm of algebra and the theory of functions.
谈到一个非常不同的研究领域,我提醒您三个机构的问题。庞加莱在天体力学中引入的富有成效的方法和影响深远的原理,以及今天在实际天文学中得到认可和应用的,都是由于他致力于重新处理这个难题并接近解决方案。
To speak of a very different region of research, I remind you of the problem of three bodies. The fruitful methods and the far-reaching principles which Poincaré has brought into celestial mechanics and which are today recognized and applied in practical astronomy are due to the circumstance that he undertook to treat anew that difficult problem and to approach nearer a solution.
最后提到的两个问题——费马问题和三体问题——在我们看来几乎就像相反的两极——前者是纯粹理性的自由发明,属于抽象数论领域,后者是天文学强加给我们的对于理解自然最简单的基本现象是必要的。
The two last mentioned problems—that of Fermat and the problem of the three bodies—seem to us almost like opposite poles—the former a free invention of pure reason, belonging to the region of abstract number theory, the latter forced upon us by astronomy and necessary to an understanding of the simplest fundamental phenomena of nature.
但也经常发生这样的情况:同一特殊问题在最不同的数学知识分支中得到应用。因此,例如,最短线问题在几何基础、曲线和曲面理论、力学和变分法中起着主要且具有历史意义的重要作用。F.克莱因在他关于二十面体的著作中,多么令人信服地描绘了正多面体问题在初等几何、群论、方程论和线性微分方程中的重要性。……
But it often happens also that the same special problem finds application in the most unlike branches of mathematical knowledge. So, for example, the problem of the shortest line plays a chief and historically important part in the foundations of geometry, in the theory of curved lines and surfaces, in mechanics and in the calculus of variations. And how convincingly has F. Klein, in his work on the icosahedron, pictured the significance which attaches to the problem of the regular polyhedra in elementary geometry, in group theory, in the theory of equations and in that of linear differential equations. …
仍然需要简要讨论解决数学问题时可以合理地规定哪些一般要求。我首先要说的是:通过基于有限数量的假设的有限步骤,可以确定解决方案的正确性,这些假设隐含在问题的陈述中,并且必须始终是完全制定。这种通过有限数量的过程进行逻辑推演的要求,就是对推理严谨性的要求。事实上,严谨性的要求在数学中已成为众所周知,它符合我们理解的普遍哲学必然性。另一方面,只有满足了这一要求,问题的思想内容和暗示性才能充分发挥作用。一个新问题,尤其是当它来自外部经验世界时,就像一根年轻的树枝,它会茁壮成长并结出果实只有当它按照严格的园艺规则小心地嫁接在老茎上时,我们数学科学的既定成就才得以实现。……
It remains to discuss briefly what general requirements may be justly laid down for the solution of a mathematical problem. I should say first of all, this: that it shall be possible to establish the correctness of the solution by means of a finite number of steps based upon a finite number of hypotheses which are implied in the statement of the problem and which must always be exactly formulated. This requirement of logical deduction by means of a finite number of processes is simply the requirement of rigor in reasoning. Indeed the requirement of rigor, which has become proverbial in mathematics, corresponds to a universal philosophical necessity of our understanding; and, on the other hand, only by satisfying this requirement do the thought content and the suggestiveness of the problem attain their full effect. A new problem, especially when it comes from the world of outer experience, is like a young twig, which thrives and bears fruit only when it is grafted carefully and in accordance with strict horticultural rules upon the old stem, the established achievements of our mathematical science. …
关于数学问题可能带来的困难以及克服这些困难的方法的一些评论可能在这里。
Some remarks upon the difficulties which mathematical problems may offer, and the means of surmounting them, may be in place here.
如果我们没有成功地解决数学问题,原因往往在于我们未能认识到更普遍的观点,从这个观点来看,我们面前的问题只是一系列相关问题中的一个环节。找到了这个立场之后,我们不仅对这个问题的研究往往更容易理解,而且同时我们也掌握了一种同样适用于相关问题的方法。柯西提出的复杂积分路径和库默提出的数论理想概念可以作为例子。这种寻找通用方法的方式无疑是最实用、最确定的;因为,如果一个人在头脑中没有明确的问题而寻求方法,那么他的寻求大部分都是徒劳的。
If we do not succeed in solving a mathematical problem, the reason frequently consists in our failure to recognize the more general standpoint from which the problem before us appears only as a single link in a chain of related problems. After finding this standpoint, not only is this problem frequently more accessible to our investigation, but at the same time we come into possession of a method which is applicable also to related problems. The introduction of complex paths of integration by Cauchy and of the notion of the ideals in number theory by Kummer may serve as examples. This way for finding general methods is certainly the most practicable and the most certain; for he who seeks for methods without having a definite problem in mind seeks for the most part in vain.
我认为,在处理数学问题时,专业化比泛化发挥着更重要的作用。也许在大多数情况下,我们徒劳地寻找问题的答案,失败的原因在于,比当前问题更简单、更容易的问题要么根本没有解决,要么没有完全解决。那么,一切都取决于找出这些更简单的问题,并通过尽可能完美的设备和能够概括的概念来解决它们。这条规则是克服数学困难的最重要的杠杆之一,在我看来,它几乎总是被使用,尽管可能是无意识的。
In dealing with mathematical problems, specialization plays, as I believe, a still more important part than generalization. Perhaps in most cases where we seek in vain the answer to a question, the cause of the failure lies in the fact that problems simpler and easier than the one in hand have been either not at all or incompletely solved. All depends, then, on finding out these easier problems, and on solving them by means of devices as perfect as possible and of concepts capable of generalization. This rule is one of the most important levers for overcoming mathematical difficulties and it seems to me that it is used almost always, though perhaps unconsciously.
有时我们会在不充分的假设下或在不正确的意义上寻求解决方案,因此不会成功。那么问题就出现了:证明在给定的假设下或在所设想的意义上解决方案是不可能的。这种不可能性的证明受到古人的影响,例如,当他们证明等腰直角三角形的斜边与边的比率是无理数时。在后来的数学中,关于某些解的不可能性的问题起着重要作用,我们以这种方式感知古老而困难的问题,例如平行公理的证明、圆的平方或解根式五次方程终于找到了完全令人满意且严格的解,尽管其意义与最初的意图不同。可能正是这个重要的事实和其他哲学原因一起产生了这样的信念(每个数学家都认同这一点,但尚未有人得到证明的支持):每个确定的数学问题都必然能够得到精确的解决,或者以对所提出问题的实际答案的形式,或通过证明其解决方案的不可能性以及因此所有尝试必然失败的形式。以任何明确的未解决问题为例,例如有关欧拉-马斯切罗尼常数C的无理性的问题,或者无限多个 2ⁿ + 1 形式的素数的存在性问题。无论这些问题对我们来说多么难以解决,但尽管我们无助地站在它们面前,但我们坚信,它们的解决方案必须遵循有限数量的纯逻辑过程。
Occasionally it happens that we seek the solution under insufficient hypotheses or in an incorrect sense, and for this reason do not succeed. The problem then arises: to show the impossibility of the solution under the given hypotheses, or in the sense contemplated. Such proofs of impossibility were effected by the ancients, for instance when they showed that the ratio of the hypotenuse to the side of an isosceles right triangle is irrational. In later mathematics, the question as to the impossibility of certain solutions plays a preeminent part, and we perceive in this way that old and difficult problems, such as the proof of the axiom of parallels, the squaring of the circle, or the solution of equations of the fifth degree by radicals have finally found fully satisfactory and rigorous solutions, although in another sense than that originally intended. It is probably this important fact along with other philosophical reasons that gives rise to the conviction (which every mathematician shares, but which no one has as yet supported by a proof) that every definite mathematical problem must necessarily be susceptible of an exact settlement, either in the form of an actual answer to the question asked, or by the proof of the impossibility of its solution and therewith the necessary failure of all attempts. Take any definite unsolved problem, such as the question as to the irrationality of the Euler-Mascheroni constant C, or the existence of an infinite number of prime numbers of the form 2ⁿ + 1. However unapproachable these problems may seem to us and however helpless we stand before them, we have, nevertheless, the firm conviction that their solution must follow by a finite number of purely logical processes.
每个问题都可解决的这一公理是否只是数学思想的独特特征,或者它可能是心灵本质中固有的普遍法则,即它提出的所有问题都必须是可回答的?因为在其他科学中,人们也会遇到老问题,这些问题已通过证明其不可能性而以对科学最令人满意和最有用的方式得到解决。我以永动机问题为例。在徒劳地寻求建造永动机之后,我们研究了如果这种机器不可能实现的话,自然力之间必须存在的关系。这个颠倒的问题导致了能量守恒定律的发现,该定律再次解释了最初意义上的永动机的不可能性。
Is this axiom of the solvability of every problem a peculiarity characteristic of mathematical thought alone, or is it possibly a general law inherent in the nature of the mind, that all questions which it asks must be answerable? For in other sciences also one meets old problems which have been settled in a manner most satisfactory and most useful to science by the proof of their impossibility. I instance the problem of perpetual motion. After seeking in vain for the construction of a perpetual motion machine, the relations were investigated which must subsist between the forces of nature if such a machine is to be impossible; and this inverted question led to the discovery of the law of the conservation of energy, which, again, explained the impossibility of perpetual motion in the sense originally intended.
对每个数学问题都可以解决的信念对工人来说是一种强大的激励。我们听到内心不断的呼唤:问题来了。求其解决办法。你可以通过纯粹的理性找到它,因为数学中不存在无知者。……
This conviction of the solvability of every mathematical problem is a powerful incentive to the worker. We hear within us the perpetual call: There is the problem. Seek its solution. You can find it by pure reason, for in mathematics there is no ignorabimus. …
2.算术公理的兼容性
2. THE COMPATIBILITY OF THE ARITHMETICAL AXIOMS
当我们致力于研究一门科学的基础时,我们必须建立一个公理体系,其中包含对该科学基本思想之间存在的关系的准确而完整的描述。如此建立的公理同时也是那些基本思想的定义。在我们正在测试其基础的科学领域内,任何陈述都不能被认为是正确的,除非它可以通过有限数量的逻辑步骤从这些公理中推导出来。经过仔细考虑,问题出现了:单个公理的某些陈述是否以任何方式相互依赖,以及这些公理是否因此可能不包含某些共同部分,如果人们希望得出一个公理系统,则必须将这些共同部分分离出来彼此之间应完全独立。
When we are engaged in investigating the foundations of a science, we must set up a system of axioms which contains an exact and complete description of the relations subsisting between the elementary ideas of that science. The axioms so set up are at the same time the definitions of those elementary ideas; and no statement within the realm of the science whose foundation we are testing is held to be correct unless it can be derived from those axioms by means of a finite number of logical steps. Upon closer consideration the question arises: Whether, in any way, certain statements of single axioms depend upon one another, and whether the axioms may not therefore contain certain parts in common, which must be isolated if one wishes to arrive at a system of axioms that shall be altogether independent of one another.
但最重要的是,我希望将以下问题指定为关于公理可以提出的众多问题中最重要的一个:证明它们并不矛盾,也就是说,基于它们的有限数量的逻辑步骤永远不可能导致矛盾的结果。 ……
But above all I wish to designate the following as the most important among the numerous questions which can be asked with regard to the axioms: To prove that they are not contradictory, that is, that a finite number of logical steps based upon them can never lead to contradictory results. …
10.丢番图方程可解性的测定
10. DETERMINATION OF THE SOLVABILITY OF A DIOPHANTINE EQUATION
给定一个具有任意数量的未知量和有理积分数值系数的丢番图方程:设计一个过程,根据该过程可以通过有限数量的运算确定该方程是否可解为有理整数。 ……
Given a diophantine equation with any number of unknown quantities and with rational integral numerical coefficients: To devise a process according to which it can be determined by a finite number of operations whether the equation is solvable in rational integers. …
重印自希尔伯特 (1902)。
Reprinted from Hilbert (1902).
“Entscheidungsproblem”在德语中是“决策问题”的意思,带有定冠词,指的是确定任意数学命题是否可证明的问题。1928 年,数学的逻辑基础已经足够牢固,大卫·希尔伯特和他的学生威廉·阿克曼可以规定所讨论的公理系统是谓词演算(这里称为“泛函演算 K” )。人们正在寻找一种能够有效回答所有数学问题并完成希尔伯特程序的程序。许多研究人员为谓词演算的片段提供了决策程序,但起初并没有认真研究不可能证明——部分原因是“决策程序”的概念仅仅是直观的。假设可能不存在任何过程,则需要限制所有过程的类。
“Entscheidungsproblem” is German for “decision problem,” and with the definite article refers to the problem of determining whether an arbitrary mathematical proposition is provable. In 1928, the logical foundations of mathematics had been firmly enough established that David Hilbert and his student Wilhelm Ackermann could stipulate that the axiomatic system in question was the predicate calculus (referred to here as the “functional calculus K”). The hunt was on to find a procedure that would, in effect, answer all mathematical questions and complete Hilbert’s program. Various researchers offered decision procedures for fragments of the predicate calculus, but there was at first no serious work toward an impossibility proof—in part because the notion of a “decision procedure” was merely intuitive. Imagining that no procedure might exist required that the class of all procedures be circumscribed.
1930年,库尔特·哥德尔宣布了他开创性的否定结果(Gödel,1931),即在任何像《数学原理》(怀特海德和罗素,1910)或谓词演算这样的系统中,有些命题既不可证明,也不可证伪(除非系统本身不一致) ,在这种情况下一切都是可证明的)。为此,哥德尔利用了康托在 1891 年使用的对角化论证的一个版本来证明实数不可数(按照图灵的用法,不是“可枚举的”)。哥德尔的另一项创新是将有限的符号串(例如谓词演算的公式)编码为正整数,使用前 k 个素数的乘积得到长度为 k 的字符串,并计算第k个素数的指数编码字符串中第k个符号的数字。
In 1930 Kurt Gödel announced his groundbreaking negative result (Gödel, 1931), that in any system like that of Principia Mathematica (Whitehead and Russell, 1910) or the predicate calculus, some propositions were neither provable nor disprovable (unless the system itself is inconsistent, in which case everything is provable). To do this, Gödel exploited a version of the diagonalization argument that Cantor had used in 1891 to prove that the real numbers were not countable (not “enumerable,” to follow Turing’s usage). Gödel’s other innovation was to encode a finite string of symbols (a formula of the predicate calculus, for example) as a positive integer, using a product of the first k primes for a string of length k and making the exponent of the kth prime a digit encoding the kth symbol of the string.
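The encoding described above can be sketched in a few lines: assign each symbol a positive code, then raise the k-th prime to the code of the k-th symbol and multiply. The symbol table below is a hypothetical one chosen for illustration and is not Gödel's actual assignment; unique factorization into primes is what makes the encoding reversible.

```python
# Hypothetical symbol codes (all positive); not Gödel's own table.
SYMBOL_CODES = {"0": 1, "S": 2, "=": 3, "+": 4, "(": 5, ")": 6}


def first_primes(k):
    """Return the first k primes by trial division against the primes found so far."""
    primes = []
    n = 2
    while len(primes) < k:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def encode(string):
    """Encode a string as the product of prime_k ** code(symbol_k)."""
    number = 1
    for prime, symbol in zip(first_primes(len(string)), string):
        number *= prime ** SYMBOL_CODES[symbol]
    return number


def decode(number):
    """Recover the string from a number produced by encode."""
    code_to_symbol = {code: symbol for symbol, code in SYMBOL_CODES.items()}
    symbols = []
    candidate = 2
    while number > 1:
        exponent = 0
        while number % candidate == 0:  # only primes can divide at this point
            number //= candidate
            exponent += 1
        if exponent:
            symbols.append(code_to_symbol[exponent])
        candidate += 1
    return "".join(symbols)


print(encode("0=0"), decode(encode("0=0")))
```

Because every code is at least 1, each of the first k primes actually divides the result, so the length of the string is recoverable along with its symbols. Note that `decode` assumes its input came from `encode`.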
哥德尔的结果令人震惊,但它并没有解决 Entscheidungsproblem。(不过,正如图灵在第 60 页所解释的那样,假如哥德尔证明的是与他实际证明的相反的结果,他就会从肯定的方面解决 Entscheidungsproblem。)阿朗佐·丘奇和图灵相继迅速地以截然不同但等价的方式形式化了“算法”的概念:丘奇(1936b)首先通过 lambda 演算,图灵随后不久通过现在所谓的图灵机。两人都被认为证明了 Entscheidungsproblem 的不可解性。
Gödel’s result was stunning, but it did not settle the Entscheidungsproblem. (Though as Turing explains on page 60, had Gödel proved the opposite of what he did prove, he would have settled the Entscheidungsproblem in the positive.) In rapid succession, Alonzo Church and Turing formalized the notion of an “algorithm” in very different but equivalent ways, Church (1936b) first via the lambda-calculus, and Turing shortly after via what are now called Turing machines. Both are credited with showing the unsolvability of the Entscheidungsproblem.
每个作者都面临着双重任务:证明他的定理,并让读者相信他的计算过程类足够大,足以涵盖所有可以想象的计算过程。由于 lambda 演算的深奥性,Church 的论文在后一点上没有说服力——尽管 lambda 演算后来成为函数式编程的基础(第 21 章)。
Each author was faced with the dual tasks of proving his theorem, and of convincing his readers that his class of computational procedures was large enough to encompass all imaginable computational processes. Because of the abstruseness of the lambda-calculus, Church’s paper was unpersuasive on the latter point—even though the lambda-calculus would later become the basis for functional programming (chapter 21).
因此,图灵的机器必须足够简单,能够证明它们不能做什么,但又必须足够强大,能够执行任何有理智的人称之为计算的事情。我们包含两个参数的部分内容,但省略了所有代码。事实证明,图灵机器的“在盒子里写一个符号并移动到相邻盒子”的还原论已经被数学家埃米尔·波斯特预见到了,然而,他并没有采取进一步的行动。进行证明所需的步骤。波斯特的作品提到了 Church (1936b),但当时尚未出版,出现在 Davis (1965,第 289-291 页)中。
Thus Turing’s machines had to be simple enough to admit a proof of what they could not do, and yet demonstrably powerful enough to carry out anything that a reasonable person would call a computation. We include parts of both arguments, but we omit all the code. It turned out that the write-a-symbol-in-a-box-and-move-to-an-adjacent-box reductionism of Turing’s machines had been anticipated by the mathematician Emil Post, who, however, did not take the further steps needed to carry out the proof. Post’s work, which refers to Church (1936b) but was unpublished at the time, appears in Davis (1965, pp. 289–291).
对图灵的用语作些注解会有所帮助。他的机器(他使用“计算机”一词仅指进行计算的人)被设计成从空白纸带上启动并开始工作。它们打印 0 和 1(他称之为“数字”),其间也许还夹杂着其他符号。这串可能无限的数字应读作以小数点开头,因此表示 0 到 1 之间的一个实数。有些这样的实数是可计算的,有些则不可计算(因为机器的数量是可数的,而 [0,1] 中实数的数量不可数)。很容易将可计算数视为表示可计算的非负整数集合的一种方式(计算 [0,1] 中某个实数的第 i 位,相当于确定整数 i 是否属于该实数所对应的集合),或者借助不同的编码系统,视为表示从整数到整数的可计算函数,等等。
It will be helpful to gloss Turing’s language. His machines (he used the term “computer” only to refer to a human being carrying out a computation) are designed simply to start up on blank tape and go to work. They print 0s and 1s (what he calls “figures”), perhaps interspersed with other symbols. The possibly infinite series of figures is to be read with a decimal point at the beginning, and so represents a real number between 0 and 1. Some such real numbers are computable, and some are not (because the number of machines is countable but the number of reals in [0,1] is not). It is easy to see a computable number as a way of representing a computable set of nonnegative integers (calculating the ith bit of a real number in [0, 1] is equivalent to determining whether the integer i is or is not in the set corresponding to that real number), or a computable function from integers to integers using a different coding system, etc.
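The correspondence between a number's binary figures and membership in a set of integers can be made concrete with a small sketch. The choice of a rational input and of counting positions from 1 are assumptions made here for illustration; any computable real would serve.

```python
from fractions import Fraction
from itertools import islice


def binary_figures(x):
    """Yield the binary figures of a rational x with 0 <= x < 1, one at a time."""
    while True:
        x *= 2
        bit = int(x >= 1)
        yield bit
        x -= bit


def in_set(x, i):
    """Decide whether i (counting from 1) belongs to the set that x represents.

    This is the same work as computing the i-th figure of x.
    """
    return next(islice(binary_figures(x), i - 1, None)) == 1
```

With `x = Fraction(1, 3)` the figures run 0, 1, 0, 1, … so under this convention the number 1/3 represents the set of even positive integers.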
图灵所说的m配置现在被称为机器的状态,并且通常使用字母q的变体来表示。相比之下,“配置”由状态和扫描符号组成,即决定下一步行动的所有内容,而“完整配置”还包括磁带的全部内容和扫描头的位置 -关于机器在特定运行时刻的所有信息。
What Turing calls an m-configuration would today be called a state of the machine, and is generally denoted using a variant on the letter q. A “configuration,” by contrast, consists of both the state and the scanned symbol, that is, everything that determines the next move, and a “complete configuration” includes also the full contents of the tape and the location of the scanning head—everything there is to say about the machine at a particular moment of its operation.
“圆形”机器是只打印有限多个图形的机器,而“无圆”机器是打印无限系列图形的机器。
A “circular” machine is one that prints only finitely many figures, and a “circle-free” machine is one that prints an infinite series of figures.
图灵的“通用”机器是一种可以模拟任何其他机器的机器。为了提供一台机器作为通用机器的输入,图灵需要通过固定字母表对任意机器进行编码。他将状态q i表示为DA i,将符号S j表示为DC j。这样编码的四元组的串联就是图灵称为“标准描述”(SD)的字符串,这是存储程序的第一个清晰实例:通用存储器可以用来存储程序,因此程序之间没有本质区别和数据。图灵用十进制数字替换了通用代码中的几个单独符号,为每台机器导出了一个数值,他称之为“描述号”或 DN——这是一种与哥德尔设计的不同的将字符串编码为数字的技术。从通用机器的存在来看,图灵能够通过对角化证明没有机器能够可靠地区分圆形机器和无圆机器。再经过几个步骤,Entscheidungs 问题就变得无法解决了。(我们省略了逻辑公式中代表机器计算的大部分结构——细节有缺陷,后来得到了纠正[Turing, 1938]。)
Turing’s “universal” machine is one that can simulate any other. To provide a machine as input to the universal machine, Turing needed an encoding of arbitrary machines over a fixed alphabet. He represented state q_i as DA^i (the letter D followed by i copies of A) and symbol S_j as DC^j. The concatenation of the quintuples thus encoded was a string Turing called the “standard description” (S.D), the first clear instantiation of a stored program: the general purpose memory could be used to store a program, so there was no essential difference between program and data. Replacing the few individual symbols in this universal code by decimal digits, Turing derived a numerical value for each machine, what he called its “description number” or D.N, a technique for encoding strings as numbers different from the one Gödel had devised. From the existence of the universal machine Turing was able to show by diagonalization that no machine can reliably distinguish circular from circle-free machines. In a couple of further steps, the unsolvability of the Entscheidungsproblem followed. (We omit most of the construction representing a machine computation in a logical formula; the details were flawed and later corrected [Turing, 1938].)
安德鲁·霍奇斯(Andrew Hodges)的传记(1983 年)详细记录了艾伦·马西森·图灵(Alan Mathison Turing,1912–1954)的一生,电影《模仿游戏》就是根据该传记改编的。图灵在剑桥获得数学一等荣誉学位两年后写了这篇论文,随后在普林斯顿师从丘奇攻读博士学位。二战期间,他的数学生涯转向支持密码破译工作。他的同性恋身份被曝光后,他失去了安全许可并被捕。在接受化学阉割以代替监禁之后,他死于氰化物中毒,显然是自杀,不过也有人指出有证据表明他的死可能是意外。他去世时年仅 41 岁,正如科普作家切特·雷莫(Chet Raymo,1996)所说,他是“一位输给非理性的逻辑巨人”。2013 年,伊丽莎白女王追授图灵赦免,将他的猥亵定罪从记录中抹去;2019 年,英格兰银行宣布他的肖像将出现在 50 英镑的钞票上。
The life of Alan Mathison Turing (1912–1954) is well documented in the biography by Andrew Hodges (1983), on which the film The Imitation Game is based. Turing wrote this paper two years after receiving first class honours in mathematics from Cambridge, and then studied for his PhD under Church at Princeton. His mathematical career was diverted in support of the codebreaking effort during World War II. Outed as a homosexual, he lost his security clearance and was arrested. Having accepted chemical castration as an alternative to imprisonment, he died of cyanide poisoning, apparently by suicide, though some have noted evidence that the death might have been accidental. Only 41 years old at the time of his death, he was “a giant of logic lost to the irrational,” as science writer Chet Raymo (1996) put it. Queen Elizabeth posthumously pardoned Turing in 2013, wiping his indecency conviction from the record, and in 2019 the Bank of England announced that his image would appear on the £50 banknote.
“可计算”数可以简单地描述为实数,其小数形式可以通过有限方式计算。尽管本文的主题表面上是可计算数,但定义和研究整数变量或实数或可计算变量、可计算谓词等的可计算函数几乎同样容易。然而,所涉及的基本问题在每种情况下都是相同的,并且我选择了可计算的数字来进行明确的处理,因为涉及最不麻烦的技术。我希望很快就能对可计算的数字、函数等等之间的关系进行说明。这将包括以可计算数表示的实变量函数理论的发展。根据我的定义,如果一个数字的小数可以被机器写下来,那么这个数字就是可计算的。
THE “computable” numbers may be described briefly as the real numbers whose expressions as a decimal are calculable by finite means. Although the subject of this paper is ostensibly the computable numbers, it is almost equally easy to define and investigate computable functions of an integral variable or a real or computable variable, computable predicates, and so forth. The fundamental problems involved are, however, the same in each case, and I have chosen the computable numbers for explicit treatment as involving the least cumbrous technique. I hope shortly to give an account of the relations of the computable numbers, functions, and so forth to one another. This will include a development of the theory of functions of a real variable expressed in terms of computable numbers. According to my definition, a number is computable if its decimal can be written down by a machine.
在第6.9节和第 6.10 节中,我给出了一些论证,旨在表明可计算数包括所有自然可以被视为可计算的数。特别是,我证明了某些大类数字是可计算的。例如,它们包括所有代数数的实部、贝塞尔函数零点的实部、数字π、e等。然而,可计算数并不包括所有可定义的数,一个例子是给定一个不可计算的可定义数字。
In §§6.9, 6.10 I give some arguments with the intention of showing that the computable numbers include all numbers which could naturally be regarded as computable. In particular, I show that certain large classes of numbers are computable. They include, for instance, the real parts of all algebraic numbers, the real parts of the zeros of the Bessel functions, the numbers π, e, etc. The computable numbers do not, however, include all definable numbers, and an example is given of a definable number which is not computable.
尽管可计算数的类别如此之大,并且在许多方面与实数的类别相似,但它仍然是可枚举的。在第 6.8 节中,我考察了某些似乎能证明相反结论的论证。通过正确应用这些论证之一,得出的结论表面上与哥德尔(1931)的结论相似。这些结果具有有价值的应用。特别是,它表明(§6.11)希尔伯特的 Entscheidungsproblem 不可能有解。
Although the class of computable numbers is so great, and in many ways similar to the class of real numbers, it is nevertheless enumerable. In §6.8 I examine certain arguments which would seem to prove the contrary. By the correct application of one of these arguments, conclusions are reached which are superficially similar to those of Gödel (1931). These results have valuable applications. In particular, it is shown (§6.11) that the Hilbertian Entscheidungsproblem can have no solution.
在最近的一篇论文中,Alonzo Church(1936b)引入了“有效可计算性”的概念,这相当于我的“可计算性”,但定义却截然不同。Church 也对 Entscheidungsproblem 得出了类似的结论(Church,1936a)。本文附录中概述了“可计算性”和“有效可计算性”之间的等价性证明。[编辑:此处省略附录。]
In a recent paper Alonzo Church (1936b) has introduced an idea of “effective calculability,” which is equivalent to my “computability,” but is very differently defined. Church also reaches similar conclusions about the Entscheidungsproblem (Church, 1936a). The proof of equivalence between “computability” and “effective calculability” is outlined in an appendix to the present paper. [EDITOR: Appendix omitted here.]
我们说过,可计算数是那些小数可以通过有限的方法计算出来的数。这需要更明确的定义。在我们到达第6.9节之前,不会真正尝试证明所给出的定义的合理性。目前我只想说,其合理性在于人类的记忆必然是有限的。
We have said that the computable numbers are those whose decimals are calculable by finite means. This requires rather more explicit definition. No real attempt will be made to justify the definitions given until we reach §6.9. For the present I shall only say that the justification lies in the fact that the human memory is necessarily limited.
我们可以将计算实数过程中的人与机器进行比较,机器只能处理有限数量的条件q 1 , q 2 , … , q R,这将被称为“ m配置”。该机器配备了一条贯穿其中的“带子”(类似于纸),并分为多个部分(称为“方块”),每个部分都可以承载一个“符号”。在任何时刻都只有一个方格,比如第r个带有符号𝔖 ( r ) 的方格,即“在机器中”。我们可以将这个正方形称为“扫描正方形”。扫描方块上的符号可称为“扫描符号”。可以说,“扫描符号”是机器“直接感知”的唯一符号。然而,通过改变其m配置,机器可以有效地记住它之前“看到”(扫描)过的一些符号。机器在任何时刻可能的行为由m配置q n和扫描符号𝔖 ( r ) 决定。这对q n , 𝔖 ( r ) 将被称为“配置”:因此配置决定了机器可能的行为。在扫描的方块是空白的(即不带有符号)的一些配置中,机器在扫描的方块上写下新的符号:在其他配置中,它擦除扫描的符号。机器还可以改变正在扫描的正方形,但只能将其向右或向左移动一位。除了这些操作中的任何一个之外,还可以更改m配置。写下的一些符号将形成数字序列,即正在计算的实数的小数。其他的只是粗略的笔记,以“帮助记忆”。只有这些粗略的笔记才容易被删除。
We may compare a man in the process of computing a real number to a machine which is only capable of a finite number of conditions q_1, q_2, …, q_R which will be called “m-configurations.” The machine is supplied with a “tape” (the analogue of paper) running through it, and divided into sections (called “squares”) each capable of bearing a “symbol.” At any moment there is just one square, say the r-th, bearing the symbol 𝔖(r), which is “in the machine.” We may call this square the “scanned square.” The symbol on the scanned square may be called the “scanned symbol.” The “scanned symbol” is the only one of which the machine is, so to speak, “directly aware.” However, by altering its m-configuration the machine can effectively remember some of the symbols which it has “seen” (scanned) previously. The possible behaviour of the machine at any moment is determined by the m-configuration q_n and the scanned symbol 𝔖(r). This pair q_n, 𝔖(r) will be called the “configuration”: thus the configuration determines the possible behaviour of the machine. In some of the configurations in which the scanned square is blank (i.e. bears no symbol) the machine writes down a new symbol on the scanned square: in other configurations it erases the scanned symbol. The machine may also change the square which is being scanned, but only by shifting it one place to right or left. In addition to any of these operations the m-configuration may be changed. Some of the symbols written down will form the sequence of figures which is the decimal of the real number which is being computed. The others are just rough notes to “assist the memory.” It will only be these rough notes which will be liable to erasure.
我认为这些操作包括所有用于数字计算的操作。当读者熟悉机器理论时,对这一论点的辩护就会更容易。因此,在下一节中,我将继续理论的发展,并假设人们已经理解“机器”、“磁带”、“扫描”等的含义。
It is my contention that these operations include all those which are used in the computation of a number. The defence of this contention will be easier when the theory of the machines is familiar to the reader. In the next section I therefore proceed with the development of the theory and assume that it is understood what is meant by “machine,” “tape,” “scanned,” etc.
出于某些目的,我们可能会使用其运动仅部分由配置决定的机器(选择机器,或 c-机器)(因此在 §6.1 中使用了“可能”一词)。当这样的机器到达这些不明确的配置之一时,它无法继续运行,直到外部操作员做出某种任意选择。如果我们使用机器来处理公理系统,情况就会如此。在本文中,我只讨论自动机器,因此常常省略前缀 a-。
For some purposes we might use machines (choice machines or c-machines) whose motion is only partially determined by the configuration (hence the use of the word “possible” in §6.1). When such a machine reaches one of these ambiguous configurations, it cannot go on until some arbitrary choice has been made by an external operator. This would be the case if we were using machines to deal with axiomatic systems. In this paper I deal only with automatic machines, and will therefore often omit the prefix a-.
在机器运动的任何阶段,扫描的方块的数量、带上所有符号的完整序列以及m配置将被认为描述了该阶段的完整配置。机器和磁带在连续的完整配置之间的变化称为机器的移动。
At any stage of the motion of the machine, the number of the scanned square, the complete sequence of all symbols on the tape, and the m-configuration will be said to describe the complete configuration at that stage. The changes of the machine and tape between successive complete configurations will be called the moves of the machine.
如果机器达到不可能移动的配置,或者如果它继续移动,并且可能打印第二类符号,但不能打印更多第一类符号,则机器将是圆形的。术语“循环”的意义将在第6.8节中解释。
A machine will be circular if it reaches a configuration from which there is no possible move, or if it goes on moving, and possibly printing symbols of the second kind, but cannot print any more symbols of the first kind. The significance of the term “circular” will be explained in §6.8.
我们将通过更多地谈论可计算序列而不是可计算数来避免混淆。
We shall avoid confusion by speaking more often of computable sequences than of computable numbers.
…… “R”的意思是“机器移动,使其扫描紧邻先前所扫描方格右侧的方格”。“L”也类似。“E”表示“扫描的符号被擦除”,“P”表示“打印”。…… [编辑:省略了编码和编程细节。]
… “R” means “the machine moves so that it scans the square immediately on the right of the one it was scanning previously.” Similarly for “L.” “E” means “the scanned symbol is erased” and “P” stands for “prints.” … [EDITOR: encoding and programming details omitted.]
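To make the operations R, L, E, and P concrete, here is a minimal simulator sketch. The table format (a dictionary keyed by m-configuration and scanned symbol) and all names are assumptions made for illustration. The sample table is modeled on the first example machine in Turing's paper, which is omitted from this excerpt: it prints 0 and 1 on alternate squares.

```python
from collections import defaultdict

BLANK = " "

# (m-configuration, scanned symbol) -> (list of operations, next m-configuration)
ALTERNATING = {
    ("b", BLANK): (["P0", "R"], "c"),
    ("c", BLANK): (["R"], "e"),
    ("e", BLANK): (["P1", "R"], "f"),
    ("f", BLANK): (["R"], "b"),
}


def run(table, start, steps):
    """Run a machine for at most `steps` moves and return the visited tape as a string."""
    tape = defaultdict(lambda: BLANK)    # unbounded tape, blank by default
    position, m_config = 0, start
    for _ in range(steps):
        key = (m_config, tape[position])
        if key not in table:             # no possible move from this configuration
            break
        operations, m_config = table[key]
        for op in operations:
            if op == "R":
                position += 1            # scan the square to the right
            elif op == "L":
                position -= 1            # scan the square to the left
            elif op == "E":
                tape[position] = BLANK   # erase the scanned symbol
            elif op.startswith("P"):
                tape[position] = op[1:]  # print the symbol named after "P"
            else:
                raise ValueError(f"unknown operation {op!r}")
    if not tape:
        return ""
    return "".join(tape[i] for i in range(min(tape), max(tape) + 1))
```

Calling `run(ALTERNATING, "b", 8)` leaves the figures 0, 1, 0, 1 on alternate squares. The step limit is a practical concession: a circle-free machine never stops on its own.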
……让我们写下从机器表中形成的所有表达式,并用分号分隔它们。这样我们就获得了机器的完整描述。在本描述中,我们将q i替换为字母“D”,后跟重复i次的字母“A”,并将S j替换为“D”,后跟重复j次的“C”。这种新的机器描述可以称为标准描述(SD)。它完全由字母“A”、“C”、“D”、“L”、“R”、“N”和“;”组成。
… Let us write down all expressions so formed from the table for the machine and separate them by semi-colons. In this way we obtain a complete description of the machine. In this description we shall replace q_i by the letter “D” followed by the letter “A” repeated i times, and S_j by “D” followed by “C” repeated j times. This new description of the machine may be called the standard description (S.D). It is made up entirely from the letters “A”, “C”, “D”, “L”, “R”, “N”, and from “;”.
最后,如果我们将“A”替换为“1”,“C”替换为“2”,“D”替换为“3”,“L”替换为“4”,“R”替换为“5”,“N”替换为“6”,“;”替换为“7”,我们就得到一个以阿拉伯数字形式给出的机器描述。由该数字表示的整数可以称为机器的描述编号(D.N)。D.N 唯一地决定 S.D 和机器的结构。D.N 为 n 的机器可记作 ℳ(n)。
If finally we replace “A” by “1”, “C” by “2”, “D” by “3”, “L” by “4”, “R” by “5”, “N” by “6”, and “;” by “7”, we shall have a description of the machine in the form of an arabic numeral. The integer represented by this numeral may be called a description number (D.N) of the machine. The D.N determine the S.D and the structure of the machine uniquely. The machine whose D.N is n may be described as ℳ(n).
每个可计算序列对应至少一个描述编号,而没有描述编号则对应多个可计算序列。因此,可计算的序列和数字是可枚举的。……
To each computable sequence there corresponds at least one description number, while to no description number does there correspond more than one computable sequence. The computable sequences and numbers are therefore enumerable. …
31332531173113353111731113322531111731111335317 是一个描述编号,3133253117311335311173111332253111173111133531731323253117 也是。
A description number is 31332531173113353111731113322531111731111335317, and so is 3133253117311335311173111332253111173111133531731323253117.
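The two numbers quoted above can be reproduced mechanically. The sketch below assumes each table entry is a five-part instruction (i, j, k, move, m), read as “in q_i scanning S_j, print S_k, move, and go to q_m”, with S_0 the blank, S_1 the figure 0, and S_2 the figure 1. That reading follows the standard form in Turing's paper, which this excerpt omits; the function and variable names are illustrative.

```python
# Letter-to-digit substitution stated in the text.
DIGIT = {"A": "1", "C": "2", "D": "3", "L": "4", "R": "5", "N": "6", ";": "7"}


def standard_description(instructions):
    """Build the S.D: q_i -> D A^i, S_j -> D C^j, entries separated by ';'."""
    parts = []
    for i, j, k, move, m in instructions:
        parts.append(
            "D" + "A" * i      # current m-configuration q_i
            + "D" + "C" * j    # scanned symbol S_j
            + "D" + "C" * k    # symbol to print S_k
            + move             # L, R, or N
            + "D" + "A" * m    # next m-configuration q_m
            + ";"
        )
    return "".join(parts)


def description_number(sd):
    """Replace each letter of the S.D by its digit and read the result as an integer."""
    return int("".join(DIGIT[ch] for ch in sd))


# A machine that prints 0 and 1 on alternate squares (assumed instruction table).
ALTERNATING = [
    (1, 0, 1, "R", 2),
    (2, 0, 0, "R", 3),
    (3, 0, 2, "R", 4),
    (4, 0, 0, "R", 1),
]
```

Under these assumptions `description_number(standard_description(ALTERNATING))` gives the first number quoted, and appending the extra instruction `(1, 1, 1, "R", 2)` gives the second.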
作为无圆机器的描述数字的数字将被称为令人满意的数字。第6.8节表明,不存在确定给定数字是否令人满意的通用过程。
A number which is a description number of a circle-free machine will be called a satisfactory number. In §6.8 it is shown that there can be no general process for determining whether a given number is satisfactory or not.
发明一台可用于计算任何可计算序列的机器是可能的。如果这台机器𝒰配备了磁带,磁带的开头写有某个计算机ℳ的SD ,那么𝒰将计算与ℳ相同的序列。……
It is possible to invent a single machine which can be used to compute any computable sequence. If this machine 𝒰 is supplied with a tape on the beginning of which is written the S.D of some computing machine ℳ, then 𝒰 will compute the same sequence as ℳ. …
人们可能会认为,证明实数不可枚举的论证也将证明可计算的数字和序列不可枚举(Hobson,1921,第 87-88 页)。例如,可以认为可计算数序列的极限必须是可计算的。显然,只有当可计算数字的序列由某种规则定义时,这才是正确的。
It may be thought that arguments which prove that the real numbers are not enumerable would also prove that the computable numbers and sequences cannot be enumerable (Hobson, 1921, pp. 87–88). It might, for instance, be thought that the limit of a sequence of computable numbers must be computable. This is clearly only true if the sequence of computable numbers is defined by some rule.
或者我们可以应用对角线过程。“如果可计算序列是可枚举的,则令 α_n 为第 n 个可计算序列,并令 ϕ_n(m) 为 α_n 中的第 m 个数字。令 β 为以 1 − ϕ_n(n) 作为第 n 个数字的序列。由于 β 是可计算的,因此存在一个数 K,使得对所有 n 都有 1 − ϕ_n(n) = ϕ_K(n)。令 n = K,我们得到 1 = 2ϕ_K(K),即 1 是偶数。这是不可能的。因此,可计算序列是不可枚举的。”
Or we might apply the diagonal process. “If the computable sequences are enumerable, let α_n be the nth computable sequence, and let ϕ_n(m) be the mth figure in α_n. Let β be the sequence with 1 − ϕ_n(n) as its nth figure. Since β is computable, there exists a number K such that 1 − ϕ_n(n) = ϕ_K(n) all n. Putting n = K, we have 1 = 2ϕ_K(K), i.e. 1 is even. This is impossible. The computable sequences are therefore not enumerable.”
这个论点的谬误在于β是可计算的假设。如果我们能够用有限的方法枚举可计算序列,那是对的,但是枚举可计算序列的问题相当于找出给定数字是否是无圈机器的 DN 的问题,并且我们没有通用的过程以有限数量的步骤完成此操作。事实上,通过正确应用对角过程论证,我们可以证明不可能存在任何这样的一般过程。
The fallacy in this argument lies in the assumption that β is computable. It would be true if we could enumerate the computable sequences by finite means, but the problem of enumerating computable sequences is equivalent to the problem of finding out whether a given number is the D.N of a circle-free machine, and we have no general process for doing this in a finite number of steps. In fact, by applying the diagonal process argument correctly, we can show that there cannot be any such general process.
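The diagonal construction itself is easy to write down once an enumeration is given as a function. The sketch below uses a toy enumeration, `toy_phi`, invented purely for illustration. It shows that β is computable whenever the enumeration is, which is exactly the step that fails for the computable sequences, since enumerating them would require deciding which machines are circle-free.

```python
def diagonal(phi):
    """Given phi(n, m), the m-th figure of the n-th sequence, return beta(n) = 1 - phi(n, n)."""
    return lambda n: 1 - phi(n, n)


def toy_phi(n, m):
    """A hypothetical computable enumeration: figure m of sequence n is 1 when n divides m."""
    return int(m % n == 0)


beta = diagonal(toy_phi)
```

By construction `beta(n)` differs from `toy_phi(n, n)` for every n, so β is none of the enumerated sequences. Here that is harmless, because the toy list never claimed to contain every computable sequence.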
最简单、最直接的证明是,如果这个一般过程存在,那么就存在一台计算β 的机器。这个证明虽然完美无缺,但也有一个缺点,那就是它可能会让读者产生“一定有问题”的感觉。我将给出的证明没有这个缺点,并且对“无环”概念的意义给出了一定的见解。它不依赖于构造β,而是依赖于构造β ′ ,其第 n个数字是phi n ( n ) 。
The simplest and most direct proof of this is by showing that, if this general process exists, then there is a machine which computes β. This proof, although perfectly sound, has the disadvantage that it may leave the reader with a feeling that “there must be something wrong.” The proof which I shall give has not this disadvantage, and gives a certain insight into the significance of the idea “circle-free.” It depends not on constructing β, but on constructing β′, whose nth figure is ϕ_n(n).
让我们假设有这样一个过程;也就是说,我们可以发明一台机器𝒟,当它配备任何计算机ℳ的SD时,它将测试这个SD,如果ℳ是圆形的,则用符号“ u ”标记SD ,如果它是无圆的将用“ s ”标记它。通过组合机器𝒟和𝒰,我们可以构建机器ℋ来计算序列β ′。……
Let us suppose that there is such a process; that is to say, that we can invent a machine 𝒟 which, when supplied with the S.D of any computing machine ℳ will test this S.D and if ℳ is circular will mark the S.D with the symbol “u” and if it is circle-free will mark it with “s.” By combining the machines 𝒟 and 𝒰 we could construct a machine ℋ to compute the sequence β′. …
机器ℋ的运动分为多个部分。在前N − 1 部分中,除其他外,整数 1, 2, … , N − 1 已被写下来并由机器𝒟进行测试。其中一定数量的R ( N − 1) 被发现是无圆机器的 D.N。在第N部分中,机器𝒟测试数字N。如果N满足,即如果它是无圆机器的DN,则R ( N )=1+ R ( N -1)并且DN为N的序列的前R ( N )个数字被计算。该序列的第R ( N )个数字被写为由ℋ计算的序列β ' 的数字之一。如果N不令人满意,则R ( N ) = R ( N − 1) 并且机器继续其运动的第( N + 1)部分。
The machine ℋ has its motion divided into sections. In the first N − 1 sections, among other things, the integers 1, 2, …, N − 1 have been written down and tested by the machine 𝒟. A certain number, say R(N − 1), of them have been found to be the D.N’s of circle-free machines. In the Nth section the machine 𝒟 tests the number N. If N is satisfactory, i.e., if it is the D.N of a circle-free machine, then R(N) = 1 + R(N − 1) and the first R(N) figures of the sequence of which a D.N is N are calculated. The R(N)th figure of this sequence is written down as one of the figures of the sequence β′ computed by ℋ. If N is not satisfactory, then R(N) = R(N − 1) and the machine goes on to the (N + 1)th section of its motion.
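The bookkeeping of ℋ's sections can be sketched as a loop. The two function arguments are stand-ins: `is_satisfactory` plays the role of the hypothetical machine 𝒟 (which the argument goes on to show cannot exist), and `figure_of` plays the role of the universal machine 𝒰. The toy versions used with it here are invented for illustration only.

```python
def beta_prime(is_satisfactory, figure_of, sections):
    """Carry out the first `sections` sections of H and return the figures of beta' so far.

    is_satisfactory(n): stand-in for D, deciding whether n is the D.N of a circle-free machine.
    figure_of(n, r):    stand-in for U, returning the r-th figure computed by machine M(n).
    """
    figures = []
    r = 0                                    # R(0) = 0
    for n in range(1, sections + 1):
        if is_satisfactory(n):
            r += 1                           # R(N) = 1 + R(N - 1)
            figures.append(figure_of(n, r))  # the R(N)-th figure of M(N) joins beta'
        # otherwise R(N) = R(N - 1) and H moves on to the next section
    return figures


def toy_is_satisfactory(n):
    """Toy decider: pretend exactly the multiples of 3 are satisfactory."""
    return n % 3 == 0


def toy_figure_of(n, r):
    """Toy machines: pretend every satisfactory machine prints 1, 0, 1, 0, ..."""
    return r % 2
```

With the toy stand-ins, ten sections produce three figures of β′, one for each of N = 3, 6, 9. The contradiction in the text arises only when N is the description number of ℋ itself, a case no toy table can reproduce.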
从ℋ的构造我们可以看出ℋ是无圆的。ℋ运动的每一段都会在有限步数之后结束。因为,根据我们对𝒟 的假设, N是否令人满意的决定是在有限数量的步骤中做出的。如果N不令人满意,则第N部分完成。如果N满足,这意味着DN为N的机器ℳ ( N )是无环的,因此它的第R ( N )个数字可以在有限的步骤中计算出来。当这个数字被计算并写下为β ' 的第R ( N )个数字时,第N部分就完成了。因此ℋ是无环的。
From the construction of ℋ we can see that ℋ is circle-free. Each section of the motion of ℋ comes to an end after a finite number of steps. For, by our assumption about 𝒟, the decision as to whether N is satisfactory is reached in a finite number of steps. If N is not satisfactory, then the Nth section is finished. If N is satisfactory, this means that the machine ℳ(N) whose D.N is N is circle-free, and therefore its R(N)th figure can be calculated in a finite number of steps. When this figure has been calculated and written down as the R(N)th figure of β′ the Nth section is finished. Hence ℋ is circle-free.
现在让K为ℋ的 DN 。ℋ在其运动的第K部分做了什么?它必须测试K是否令人满意,给出结论“ s ”或“ u ”。由于K是ℋ的 DN ,并且ℋ是无环的,因此结论不能是“ u ”。另一方面,判决不能是“ s ”。因为如果是的话,那么在其运动的第K个部分中ℋ将必然要计算由机器以K作为其 DN 计算的序列的前R ( K − 1) + 1 = R ( K ) 个数字,并且将R ( K ) th写为由ℋ计算的序列的数字。前R ( K ) − 1 个数字的计算可以正常进行,但计算R ( K ) th的指令相当于“计算ℋ计算出的前R ( K ) 个数字,并记下R ( K ) th。” 这个第 R ( K )个图形永远不会被发现。即,ℋ是循环的,与我们在上一段中发现的内容以及结论“ s ”相反。因此,这两个结论都是不可能的,我们的结论是不可能有机器𝒟。
Now let K be the D.N of ℋ. What does ℋ do in the Kth section of its motion? It must test whether K is satisfactory, giving a verdict “s” or “u.” Since K is the D.N of ℋ and since ℋ is circle-free, the verdict cannot be “u.” On the other hand the verdict cannot be “s.” For if it were, then in the Kth section of its motion ℋ would be bound to compute the first R(K − 1) + 1 = R(K) figures of the sequence computed by the machine with K as its D.N and to write down the R(K)th as a figure of the sequence computed by ℋ. The computation of the first R(K) − 1 figures would be carried out all right, but the instructions for calculating the R(K)th would amount to “calculate the first R(K) figures computed by ℋ and write down the R(K)th.” This R(K)th figure would never be found. I.e., ℋ is circular, contrary both to what we have found in the last paragraph and to the verdict “s.” Thus both verdicts are impossible and we conclude that there can be no machine 𝒟.
我们可以进一步证明,不存在任何机器 ℰ ,当提供任意机器 ℳ的 SD 时 ,将确定ℳ是否打印给定的符号(例如 0)。……
We can show further that there can be no machine ℰ which, when supplied with the S.D of an arbitrary machine ℳ, will determine whether ℳ ever prints a given symbol (0 say). …
尚未尝试表明“可计算”数字包括自然被视为可计算的所有数字。所有可以给出的论证从根本上来说必然诉诸直觉,因此在数学上相当不令人满意。真正有争议的问题是“计算数字时可以执行哪些可能的过程?”
No attempt has yet been made to show that the “computable” numbers include all numbers which would naturally be regarded as computable. All arguments which can be given are bound to be, fundamentally, appeals to intuition, and for this reason rather unsatisfactory mathematically. The real question at issue is “What are the possible processes which can be carried out in computing a number?”
我将使用的论据分为三种。
The arguments which I shall use are of three kinds.
(a) 直接诉诸直觉。
(a) A direct appeal to intuition.
(b) 两个定义的等价性证明(如果新定义具有更大的直观吸引力)。
(b) A proof of the equivalence of two definitions (in case the new definition has a greater intuitive appeal).
(c) 给出可计算的大类数字的例子。
(c) Giving examples of large classes of numbers which are computable.
一旦承认可计算数都是“可计算的”,其他几个具有相同特征的命题就会随之而来。特别地,如果存在确定希尔伯特函数演算的公式是否可证明的通用过程,则该确定可以由机器来执行。
Once it is granted that computable numbers are all “computable,” several other propositions of the same character follow. In particular, it follows that, if there is a general process for determining whether a formula of the Hilbert function calculus is provable, then the determination can be carried out by a machine.
I. [类型(a)]。这个论点只是对§ 6.1的思想的阐述。
I. [Type (a)]. This argument is only an elaboration of the ideas of §6.1.
计算通常是通过在纸上写下某些符号来完成的。我们可以假设这张纸像孩子的算术书一样被分成几个正方形。在初等算术中,有时会使用纸张的二维特征。但这样的使用总是可以避免的,而且我认为人们会同意纸张的二维特性对于计算来说并不是必需的。我假设计算是在一维纸上进行的,即在分成正方形的带子上进行。我还假设可以打印的符号数量是有限的。如果我们允许无限的符号,那么就会有任意小的符号差异。这种符号数量限制的影响并不是很严重。总是可以使用符号序列来代替单个符号。因此,诸如 17 或 999999999999999 之类的阿拉伯数字通常被视为单个符号。类似地,在任何欧洲语言中,单词都被视为单个符号(然而,中文试图拥有可枚举的无限符号)。从我们的角度来看,单一符号和复合符号的区别在于,复合符号如果太长,一眼就看不出来。这是根据经验得出的。我们无法一眼看出 9999999999999999 和 999999999999999 是否相同。
Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child’s arithmetic book. In elementary arithmetic the two-dimensional character of the paper is sometimes used. But such a use is always avoidable, and I think that it will be agreed that the two-dimensional character of paper is no essential of computation. I assume then that the computation is carried out on one-dimensional paper, i.e. on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite. If we were to allow an infinity of symbols, then there would be symbols differing to an arbitrarily small extent. The effect of this restriction of the number of symbols is not very serious. It is always possible to use sequences of symbols in the place of single symbols. Thus an Arabic numeral such as 17 or 999999999999999 is normally treated as a single symbol. Similarly in any European language words are treated as single symbols (Chinese, however, attempts to have an enumerable infinity of symbols). The differences from our point of view between the single and compound symbols is that the compound symbols, if they are too lengthy, cannot be observed at one glance. This is in accordance with experience. We cannot tell at a glance whether 9999999999999999 and 999999999999999 are the same.
计算机在任何时刻的行为都是由他正在观察的符号以及他当时的“心态”决定的。[编辑:这里的“计算机”是一个正在执行计算的人。]我们可以假设计算机在某一时刻可以观察到的符号或方块的数量有一个界限B。如果他想观察更多,他必须使用连续的观察。我们还将假设需要考虑的心理状态的数量是有限的。其原因与限制符号数量的原因相同。如果我们承认无限的心态,其中一些就会“任意接近”并且会感到困惑。同样,这一限制不会严重影响计算,因为可以通过在磁带上写入更多符号来避免使用更复杂的思维状态。
The behaviour of the computer at any moment is determined by the symbols which he is observing, and his “state of mind” at that moment. [EDITOR: Here a “computer” is a person who is performing computations.] We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite. The reasons for this are of the same character as those which restrict the number of symbols. If we admitted an infinity of states of mind, some of them will be “arbitrarily close” and will be confused. Again, the restriction is not one which seriously affects computation, since the use of more complicated states of mind can be avoided by writing more symbols on the tape.
让我们想象一下计算机执行的操作被分成“简单操作”,这些操作是如此基本,以至于很难想象它们进一步划分。每一个这样的操作都包括对由计算机和磁带组成的物理系统进行一些更改。如果我们知道磁带上的符号序列(其中哪些符号是由计算机观察到的(可能具有特殊顺序))以及计算机的思维状态,那么我们就知道系统的状态。我们可以假设在一项简单的操作中不会改变超过一个符号。任何其他更改都可以分解为此类简单更改。关于其符号可以以这种方式改变的方格的情况与关于观察到的方格的情况相同。因此,在不失一般性的情况下,我们可以假设符号发生变化的方格始终是“观察到的”方格。
Let us imagine the operations performed by the computer to be split up into “simple operations” which are so elementary that it is not easy to imagine them further divided. Every such operation consists of some change of the physical system consisting of the computer and his tape. We know the state of the system if we know the sequence of symbols on the tape, which of these are observed by the computer (possibly with a special order), and the state of mind of the computer. We may suppose that in a simple operation not more than one symbol is altered. Any other changes can be split up into simple changes of this kind. The situation in regard to the squares whose symbols may be altered in this way is the same as in regard to the observed squares. We may, therefore, without loss of generality, assume that the squares whose symbols are changed are always “observed” squares.
除了这些符号的变化之外,简单的运算还必须包括观察到的方块分布的变化。新观察到的方块必须能够立即被计算机识别。我认为可以合理地假设它们只能是与之前观察到的最近的正方形的距离不超过某个固定量的正方形。假设每个新观察到的方块都位于之前观察到的方块的L个方块内。
Besides these changes of symbols, the simple operations must include changes of distribution of observed squares. The new observed squares must be immediately recognisable by the computer. I think it is reasonable to suppose that they can only be squares whose distance from the closest of the immediately previously observed squares does not exceed a certain fixed amount. Let us say that each of the new observed squares is within L squares of an immediately previously observed square.
与“立即可识别性”相关,可能会认为还有其他类型的可立即识别的正方形。特别是,用特殊符号标记的方块可能被认为是可立即识别的。现在,如果这些方块仅由单个符号标记,则它们的数量只能是有限的,并且我们不应该通过将这些标记的方块与观察到的方块相连来扰乱我们的理论。另一方面,如果它们由一系列符号标记,我们就不能将识别过程视为一个简单的过程。这是一个基本点,应该加以说明。在大多数数学论文中,方程和定理都有编号。通常这些数字不会超过(比如说)1000。因此,可以通过数字一眼就认出一个定理。但如果论文很长,我们可能会得出定理157767733443477;然后,在本文中,我们可能会发现“ ……因此(应用定理 157767733443477)我们有…… ”。为了确定哪个是相关定理,我们应该逐个比较这两个数字,可能会用铅笔在数字上打勾,以确保它们不会被计算两次。如果尽管如此,仍然认为存在其他“立即可识别”的方块,那么只要这些方块可以通过我的机器类型能够执行的某种过程找到,就不会打乱我的论点。……
In connection with “immediate recognisability,” it may be thought that there are other kinds of square which are immediately recognisable. In particular, squares marked by special symbols might be taken as immediately recognisable. Now if these squares are marked only by single symbols there can be only a finite number of them, and we should not upset our theory by adjoining these marked squares to the observed squares. If, on the other hand, they are marked by a sequence of symbols, we cannot regard the process of recognition as a simple process. This is a fundamental point and should be illustrated. In most mathematical papers the equations and theorems are numbered. Normally the numbers do not go beyond (say) 1000. It is, therefore, possible to recognise a theorem at a glance by its number. But if the paper was very long, we might reach Theorem 157767733443477; then, further on in the paper, we might find “… hence (applying Theorem 157767733443477) we have ….” In order to make sure which was the relevant theorem we should have to compare the two numbers figure by figure, possibly ticking the figures off in pencil to make sure of their not being counted twice. If in spite of this it is still thought that there are other “immediately recognisable” squares, it does not upset my contention so long as these squares can be found by some process of which my type of machine is capable. …
因此,简单的操作必须包括:
The simple operations must therefore include:
(a) 观察到的方块之一上的符号发生变化。
(a) Changes of the symbol on one of the observed squares.
(b) 在先前观察到的一个方格的L个方格内观察到的一个方格到另一个方格的变化。
(b) Changes of one of the squares observed to another square within L squares of one of the previously observed squares.
其中一些变化可能必然涉及心态的变化。因此,最通用的单一操作必须被视为以下操作之一:
It may be that some of these changes necessarily involve a change of state of mind. The most general single operation must therefore be taken to be one of the following:
(A) 符号的可能变化 (a) 以及心态的可能变化。
(A) A possible change (a) of symbol together with a possible change of state of mind.
(B) 观察到的方格可能发生 (b) 的变化,以及心态可能发生的变化。
(B) A possible change (b) of observed squares, together with a possible change of state of mind.
正如所暗示的那样,实际执行的操作是由计算者的心理状态和观察到的符号决定的。特别是,它们决定了计算者在执行操作后的心理状态。
The operation actually performed is determined, as has been suggested …, by the state of mind of the computer and the observed symbols. In particular, they determine the state of mind of the computer after the operation is carried out.
我们现在可以构建一台机器来完成这位计算者的工作。计算者的每种心理状态都对应于机器的一个“m配置”。机器扫描与计算者观察到的B个方格相对应的B个方格。在任何一步移动中,机器都可以改变某个被扫描方格上的符号,或者可以将任何一个被扫描的方格换成与其他被扫描方格之一距离不超过L个方格的另一个方格。所完成的移动以及随后的配置由扫描到的符号和m配置确定。刚刚描述的机器与第6.2节中定义的计算机器没有本质上的区别,并且对应于这种类型的任何机器,都可以构造一台计算机器来计算相同的序列,即由计算者计算的序列。……
We may now construct a machine to do the work of this computer. To each state of mind of the computer corresponds an “m-configuration” of the machine. The machine scans B squares corresponding to the B squares observed by the computer. In any move the machine can change a symbol on a scanned square or can change any one of the scanned squares to another square distant not more than L squares from one of the other scanned squares. The move which is done, and the succeeding configuration, are determined by the scanned symbol and the m-configuration. The machines just described do not differ very essentially from computing machines as defined in §6.2, and corresponding to any machine of this type a computing machine can be constructed to compute the same sequence, that is to say the sequence computed by the computer. …
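The correspondence described above, in which the m-configuration and the scanned symbol together determine the move and the next configuration, can be sketched as a small table-driven simulator. This is only an illustration under simplifying assumptions: a single scanned square (B = 1), moves of one square at a time (L = 1), and a hypothetical four-entry table (the names `TABLE`, `run`, and the m-configurations `b`, `c`, `e`, `f` are invented here) that prints 0 and 1 on alternate squares.

```python
from collections import defaultdict

# Hypothetical transition table: (m-configuration, scanned symbol) ->
# (symbol to write, head movement, next m-configuration).
# " " stands for a blank square; +1 moves the head one square to the right.
TABLE = {
    ("b", " "): ("0", +1, "c"),
    ("c", " "): (" ", +1, "e"),
    ("e", " "): ("1", +1, "f"),
    ("f", " "): (" ", +1, "b"),
}


def run(table, start, steps):
    """Run the machine for at most `steps` moves and return the tape as a string."""
    tape = defaultdict(lambda: " ")  # unbounded tape, blank by default
    head, state = 0, start
    for _ in range(steps):
        key = (state, tape[head])
        if key not in table:  # no applicable rule: the machine stops
            break
        write, move, state = table[key]
        tape[head] = write
        head += move
    if not tape:
        return ""
    return "".join(tape[i] for i in range(min(tape), max(tape) + 1))


print(run(TABLE, "b", 8).split())
```

A machine scanning B squares at once could be modelled the same way by keying the table on a tuple of B symbols; the single-square case is used here only to keep the sketch short.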
§6.8的结果有一些重要的应用。特别是,它们可以用来证明希尔伯特判定问题(Entscheidungsproblem)没有解。……
The results of §6.8 have some important applications. In particular, they can be used to show that the Hilbert Entscheidungsproblem can have no solution. …
因此,我建议证明,不存在通用过程来确定泛函微积分K的给定公式𝔄是否可证明,即,不可能有任何机器在提供这些公式中的任何一个𝔄时,最终会说:𝔄是否可证明。
I propose, therefore, to show that there can be no general process for determining whether a given formula 𝔄 of the functional calculus K is provable, i.e. that there can be no machine which, supplied with any one 𝔄 of these formulae, will eventually say whether 𝔄 is provable.
也许应该指出的是,我将证明的与哥德尔众所周知的结果有很大不同。哥德尔已经证明(在《数学原理》的形式主义中)存在一些命题𝔄,使得𝔄和 − 𝔄都不可证明。结果表明,在该形式主义中无法给出数学原理(或K )的一致性证明。另一方面,我将证明,没有通用方法可以说明给定公式𝔄在K中是否可证明,或者,同样的结果,由K和作为额外公理邻接的− 𝔄组成的系统是否一致。
It should perhaps be remarked that what I shall prove is quite different from the well-known results of Gödel. Gödel has shown that (in the formalism of Principia Mathematica) there are propositions 𝔄 such that neither 𝔄 nor − 𝔄 is provable. As a consequence of this, it is shown that no proof of consistency of Principia Mathematica (or of K) can be given within that formalism. On the other hand, I shall show that there is no general method which tells whether a given formula 𝔄 is provable in K, or, what comes to the same, whether the system consisting of K with − 𝔄 adjoined as an extra axiom is consistent.
如果哥德尔所证明的结论的否定已经被证明,即如果对于每个𝔄,𝔄或 − 𝔄是可证明的,那么我们应该立即得到判定问题(Entscheidungsproblem)的解。因为我们可以发明一台机器𝒦,它将依次证明所有可证明的公式。𝒦迟早会到达𝔄或 − 𝔄。如果它到达𝔄,那么我们就知道𝔄是可证明的。如果它到达 − 𝔄,那么,由于K是一致的(Hilbert 和 Ackermann,第 65 页),我们知道𝔄是不可证明的。
If the negation of what Gödel has shown had been proved, i.e. if, for each 𝔄, either 𝔄 or − 𝔄 is provable, then we should have an immediate solution of the Entscheidungsproblem. For we can invent a machine 𝒦 which will prove consecutively all provable formulae. Sooner or later 𝒦 will reach either 𝔄 or − 𝔄. If it reaches 𝔄, then we know that 𝔄 is provable. If it reaches − 𝔄, then, since K is consistent (Hilbert and Ackermann, p. 65), we know that 𝔄 is not provable.
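The argument in this paragraph can be sketched in code: if every formula or its negation were provable, then searching an enumeration of theorems would always terminate with an answer. The theory below is a hypothetical toy (it proves only parity statements), and `enumerate_theorems`, `negate`, and `decide` are names invented for this illustration; they stand in for the machine 𝒦 and are not taken from the paper.

```python
from itertools import count


def negate(formula):
    """Return the formal negation, written with a leading '-'."""
    return formula[1:] if formula.startswith("-") else "-" + formula


def enumerate_theorems():
    """Toy stand-in for the machine K: list every provable formula in turn.

    The hypothetical theory speaks only about parity, so for each n it
    proves exactly one of even(n) and -even(n).
    """
    for n in count():
        yield f"even({n})" if n % 2 == 0 else f"-even({n})"


def decide(formula):
    """Search the enumeration until the formula or its negation appears."""
    for theorem in enumerate_theorems():
        if theorem == formula:
            return True   # the formula itself is provable
        if theorem == negate(formula):
            return False  # its negation is provable, so by consistency it is not


print(decide("even(4)"), decide("even(7)"))
```

The search halts only because this toy theory is complete for the formulas it is asked about. For a system in which neither a formula nor its negation is provable, the loop would run forever, which is why Gödel's result blocks this shortcut.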
由于K中没有整数,证明显得有些冗长。基本思想非常简单。
Owing to the absence of integers in K the proofs appear somewhat lengthy. The underlying ideas are quite straightforward.
对应于每台计算机ℳ,我们构造一个公式 Un( ℳ ),并证明,如果存在一个通用方法来确定 Un( ℳ ) 是否可证明,那么也存在一个通用方法来确定ℳ是否打印过 0。
Corresponding to each computing machine ℳ we construct a formula Un(ℳ) and we show that, if there is a general method for determining whether Un(ℳ) is provable, then there is a general method for determining whether ℳ ever prints 0.
涉及的命题函数解释如下:
The interpretations of the propositional functions involved are as follows:
R_S(x, y) 被解释为“在(ℳ的)完整配置x中,方格y上的符号是S”。I(x, y) 被解释为“在完整配置x中,方格y被扫描”。K_{q_m}(x) 被解释为“在完整配置x中,m配置是q_m”。F(x, y) 被解释为“y是x的直接后继”。……
R_S(x, y) is to be interpreted as “in the complete configuration x (of ℳ) the symbol on the square y is S.” I(x, y) is to be interpreted as “in the complete configuration x the square y is scanned.” K_{q_m}(x) is to be interpreted as “in the complete configuration x the m-configuration is q_m.” F(x, y) is to be interpreted as “y is the immediate successor of x.” …
经伦敦数学会许可,转载自图灵 (1936)。
Reprinted from Turing (1936), with permission from the London Mathematical Society.
进入二十世纪,关于数学函数值(例如十位对数)的书籍出现在每一位物理科学家和工程师的办公桌上。计算尺是有用的辅助工具,但精度有限。机电台式计算器是商业和科学实践的重要机器,但其操作员(大多是被称为“计算机”的女性)的工作极其乏味。
Well into the twentieth century, books of the values of mathematical functions, ten-place logarithms for example, were on the desk of every practicing physical scientist and engineer. Slide rules were useful aids but of limited precision. Electromechanical desk calculators were important machines for the practice of both business and science, but the work of their human operators (mostly women known as “computers”) was tedious in the extreme.
霍华德·海瑟薇·艾肯(Howard Hathaway Aiken,1900–1973 年)是哈佛大学物理学教授,曾获得美国海军预备役中校军衔。他先在西屋电气和其他电力行业公司担任工程师,33 岁时成为哈佛大学的研究生。在为论文所需的方程求近似解的过程中,艾肯厌倦了使用当时可用的机械计算器和数值表进行计算。巴贝奇的齿轮和轮子引起了他的注意并成为他的灵感,尽管这只是在他开始设计自己的自动计算器之后(Cohen,1999,第 67 页)。他似乎不知道巴贝奇设计的细节以及洛夫莱斯夫人对其如何编程的解释,也没有证据表明他了解艾伦·图灵 1936 年的开创性数学工作或图灵在第二次世界大战期间设计的密码机器。
Howard Hathaway Aiken (1900–1973) was a physics professor at Harvard who attained the rank of Commander in the U.S. Naval Reserve. He became a graduate student at Harvard at the age of 33 after working as an engineer for Westinghouse and other companies in the electric industries. In the course of grinding out approximate solutions to equations he needed for his thesis, Aiken tired of doing calculations with the available mechanical calculators and numerical tables. Babbage’s gears and wheels came to his attention and served as an inspiration, though only after he had begun to design his own automatic calculator (Cohen, 1999, p. 67). He seems to have been unaware of the details of Babbage’s design and of Lady Lovelace’s explanation of how it would be programmed, and there is no evidence that he knew anything of Alan Turing’s groundbreaking mathematical work of 1936 or the cryptologic machinery Turing designed during World War II.
这一选择是艾肯为建造有史以来最大的机电机器而提交的工业支持提案。自动序列控制计算器,后来被称为 Mark I,是在 IBM 的帮助下开发的,并于 1944 年投入使用。它的一部分至今仍陈列在哈佛。最初长约 50 英尺,高 8 英尺,深 3 英尺,重近 5 吨,它是一个奇迹,有 530 英里的电线、数千个用于开关的电磁继电器、用于保存数字常数的十进制刻度盘以及用于中间存储的十进制计数器。输入内容是在卡片纸上打孔的;输出是用电动打字机打印的。几十年来,我几乎每天都会经过这台机器。停下来并检查它,就等于对一系列进化死胡同感到困惑,从数字系统到接线。这些部件通过坚固的钟形线连接,绝缘层为未褪色的黄色、蓝色和红色。但是,Mark I 中的所有黄色电线、所有蓝色电线和所有红色电线都捆绑在一起,而不是像现代带状电缆那样将不同颜色的电线捆绑在一起,以便同一根电线的末端可以轻松匹配。这些包裹很大,有几英寸厚,今天很难想象颜色变化的用途。
This selection is the proposal Aiken submitted for industrial support in constructing the largest electromechanical machine that had ever been built. The Automatic Sequence Controlled Calculator, later dubbed the Mark I, was developed with the assistance of IBM and put into service in 1944. Part of it remains on display at Harvard. Originally some 50 feet long, 8 feet high, and 3 feet deep, weighing nearly 5 tons, it was a marvel of 530 miles of wire, thousands of electromagnetic relays for switches, decimal dials to hold numerical constants, and decimal counters for intermediate storage. Input was punched on card stock; output was printed on an electric typewriter. For decades I walked past this machine almost daily. To pause and examine it is to puzzle over a set of evolutionary dead ends, from the number system to the wiring. The parts are connected by sturdy bell wire, with insulation in unfaded yellow, blue, and red. But rather than bundling different colors together as in modern ribbon cables, so the ends of the same wire could be matched up easily, all the yellow wires in the Mark I are bundled together, as are all the blue and all the red. The bundles are massive, several inches thick, and it is hard today to imagine what purpose the color variation was supposed to serve.
最重要的是,Mark I 的程序被打孔到由 IBM 卡片所用纸料制成的纸带环中。因此,该机器是为重复操作而设计的,例如对一个级数的各项求和,但无法执行递归算法、嵌套循环,甚至在最初建成时也无法执行条件分支。即使在艾肯后来的机器(Mark IV 是最后一台)中,也没有存储程序的概念。因此,虽然 Mark I 确实是可编程的(从某种意义上说,一些研究人员可以准备新程序,而其他人则使用机器做有用的工作),但它完全不具备我们所知道的软件。艾肯对后来被称为“哈佛架构”(数据和程序存储在不同类型的存储器中)的坚持使他的机器走上了思想上的支线。发展的重心转移到了其他大学和企业,艾肯在 60 岁时从哈佛退休。
Most importantly, the Mark I’s program was punched into a loop of tape made from the paper stock used for IBM cards. So the machine was designed for repetitive operations, such as summing the terms of a series, but was incapable of executing a recursive algorithm, nested loop, or even, as originally built, a conditional branch. Even in Aiken’s later machines (the Mark IV was the last), there was no concept of a stored program. So while the Mark I was certainly programmable—in the sense that some researchers could prepare new programs while others were using the machine to do useful work—it enjoyed nothing of what we know as software. Aiken’s commitment to what came to be known as the “Harvard architecture”—data and programs in separate kinds of storage—left Aiken’s machines in an intellectual sidetrack. The action moved to other universities and to corporations, and Aiken retired from Harvard at the age of 60.
Mark I 噪音大且笨重,但它确实有效。23 位十进制数的加减法耗时 0.3 秒;乘法,最多 6 秒;除法,最多 15.6 秒;log x、e^x 或 sin x,一分钟或更长时间。这些速度足以对十几个联立线性方程进行数值求解,哈佛大学经济学家 Wassily Leontief 在发展他的投入产出理论时就将该机器用于这一用途。而艾肯笑到了最后。如今大多数在役的计算机都在嵌入式系统中,它们的程序固化在固件里,不会在机器运行时被意外更改,就像 Mark I 的纸带环一样。
The Mark I was noisy and clunky, but it worked. Addition or subtraction of 23 decimal digits took 0.3 sec.; multiplication, up to 6 sec.; division, up to 15.6 sec.; log x, e^x, or sin x, a minute or more. These speeds were sufficient for the numerical solution of a dozen simultaneous linear equations, an application for which Harvard economist Wassily Leontief used the machine while developing his input–output theory. And Aiken got the last laugh. Most working computers today are in embedded systems, their programs frozen in firmware that cannot accidentally be altered as the machine is running, just like the Mark I’s tape loop.
在算术计算中节省时间和脑力以及消除人类容易出错的弱点的愿望,可能与算术科学本身一样古老。这种愿望导致了各种计算辅助工具的设计和构造,从成组的小物体(例如卵石)开始,它们起初被随意使用,后来作为划线板上的筹码,再后来作为穿在固定于框架内的金属丝上的珠子,如算盘。这种工具很可能是由闪族人发明的,后来传入印度,并由此向西传遍欧洲,向东传到中国和日本。
THE desire to economize time and mental effort in arithmetical computations, and to eliminate human liability to error, is probably as old as the science of arithmetic itself. This desire has led to the design and construction of a variety of aids to calculation, beginning with groups of small objects, such as pebbles, first used loosely, later as counters on ruled boards, and later still as beads mounted on wires fixed in a frame, as in the abacus. This instrument was probably invented by the Semitic races and later adopted in India, whence it spread westward throughout Europe and eastward to China and Japan.
算盘发展之后,没有取得进一步的进展,直到 1617 年约翰·纳皮尔 (John Napier) 设计了他的算筹,即“纳皮尔骨筹”。此后出现了各种形式的骨筹,其中一些已接近机械计算的开端,但直到 1642 年,布莱斯·帕斯卡 (Blaise Pascal) 才为我们带来了今天意义上的第一台机械计算机器。他的机器的应用仅限于加法和减法,但在 1666 年,塞缪尔·莫兰 (Samuel Morland) 将其改造为通过重复加法进行乘法。[编辑:见莫兰 (1673) 中的描述]
After the development of the abacus, no further advances were made until John Napier devised his numbering rods, or Napier’s Bones, in 1617. Various forms of the Bones appeared, some approaching the beginning of mechanical computation, but it was not until 1642 that Blaise Pascal gave us the first mechanical calculating machine in the sense that the term is used today. The application of his machine was restricted to addition and subtraction, but in 1666 Samuel Morland adapted it to multiplication by repeated additions. [EDITOR: Described in Morland (1673)]
下一个进步是莱布尼茨取得的,他在 1671 年构思了一台乘法机,并于 1694 年完成了它的建造。 [编辑:事实上,它在 1674 年就已经开始工作,并且今天仍然有效。] 在设计这台机器的过程中,莱布尼茨发明了两项重要的技术:今天仍然作为现代计算机组件出现的设备:步进计算器和针轮。
The next advance was made by Leibniz who conceived a multiplying machine in 1671 and finished its construction in 1694. [EDITOR: In fact, it was working by 1674 and still works today.] In the process of designing this machine Leibniz invented two important devices which still occur as components of modern calculating machines today: the stepped reckoner, and the pin wheel.
与此同时,继纳皮尔发明对数之后,奥特雷德、约翰·布朗、科格歇尔、埃弗拉德等人正在开发计算尺。由于成本低且易于制造,计算尺早在 1700 年就得到了科学家的广泛认可。直到今天,计算尺仍在继续发展,越来越多地应用于解决所需精度不超过三到四位有效数字、且计算总量不太大的科学问题。特别是在工程设计中,计算尺被证明是一种非常宝贵的工具。
Meanwhile, following the invention of logarithms by Napier, the slide rule was being developed by Oughtred, John Brown, Coggeshall, Everard, and others. Owing to its low cost and ease of construction, the slide rule received wide recognition from scientific men as early as 1700. Further development has continued up to the present time, with ever increasing application to the solution of scientific problems requiring an accuracy of not more than three or four significant figures, and when the total bulk of the computation is not too great. Particularly in engineering design has the slide rule proved to be an invaluable instrument.
尽管计算尺被广泛接受,但它从未阻碍更精确的机械计算方法的发展。因此,我们发现一些有史以来最伟大的数学家和物理学家的名字与计算机器的发展有关。很自然,为了设计科学进步的手段,这些人主要从自己的角度考虑机械计算。一个值得注意的例外是帕斯卡,他发明了计算器,目的是帮助他的父亲进行金钱计算。尽管科学兴趣广泛,但现代计算机器的发展进展缓慢,直到商业企业的发展和会计日益复杂使得机械计算成为经济必需品。因此,物理学家和数学家的想法,他们预见到了可能性并给出了基础,已经转向了极好的目的,但与他们最初的目的有很大不同。
Though the slide rule was widely accepted, at no time, however, did it act as a deterrent to the development of the more precise methods of mechanical computation. Thus we find the names of some of the greatest mathematicians and physicists of all time associated with the development of calculating machinery. Naturally enough, in an effort to devise means of scientific advancement, these men considered mechanical calculation largely from their own point of view. A notable exception was Pascal who invented his calculating machine for the purpose of assisting his father in computations with sums of money. Despite this widespread scientific interest, the development of modern calculating machinery proceeded slowly until the growth of commercial enterprises and the increasing complexity of accounting made mechanical computation an economic necessity. Thus the ideas of the physicists and mathematicians, who foresaw the possibilities and gave the fundamentals, have been turned to excellent purposes, but differing greatly from those for which they were originally intended.
很少有计算机是严格为科学研究而设计的,值得注意的例外是查尔斯·巴贝奇和其他追随他的人的计算机。1812 年,巴贝奇提出了一种比以前建造的计算机更高级的计算机的想法,用于计算和打印数学函数表。这台机器通过差分方法工作,被称为差分机。巴贝奇的第一个模型于 1822 年制成,并于 1823 年在英国政府的资助下开始建造该机器。该工程一直持续到 1833 年,国家援助在花费近20,000英镑后被撤回。目前该机器收藏于南肯辛顿科学博物馆。……
Few calculating machines have been designed strictly for application to scientific investigations, the notable exceptions being those of Charles Babbage and others who followed him. In 1812 Babbage conceived the idea of a calculating machine of a higher type than those previously constructed, to be used for calculating and printing tables of mathematical functions. This machine worked by the method of differences, and was known as a difference engine. Babbage’s first model was made in 1822, and in 1823 the construction of the machine was begun with the aid of a grant from the British Government. The construction was continued until 1833 when state aid was withdrawn after an expenditure of nearly £20 000. At present the machine is in the collection of the Science Museum, South Kensington. …
自巴贝奇时代以来,计算机器的发展一直在加速发展。专为加、减、乘、除等单一算术运算而设计的按键驱动计算器已达到高度完美。然而,在大型商业企业中,会计工作量如此之大,以至于这些机器已经无法满足需要。
Since the time of Babbage, the development of calculating machinery has continued at an increasing rate. Key-driven calculators designed for single arithmetical operations such as addition, subtraction, multiplication, and division, have been brought to a high degree of perfection. In large commercial enterprises, however, the volume of accounting work is so great that these machines are no longer adequate in scope.
因此,霍勒里斯又回到了巴贝奇首次在计算机器中使用的打孔卡,并为制表、计数、排序和算术机器(例如现在在工业中广泛使用的机器)的发展奠定了基础。电气设备和技术的发展在国际商业机器公司制造的这些机器中得到了应用,直到今天,巴贝奇希望完成的许多事情每天都在世界各地工业企业的会计办公室中完成。
Hollerith, therefore, returned to the punched card first employed in calculating machinery by Babbage and with it laid the groundwork for the development of tabulating, counting, sorting, and arithmetical machinery such as is now widely utilized in industry. The development of electrical apparatus and technique found application in these machines as manufactured by the International Business Machines Company, until today many of the things Babbage wished to accomplish are being done daily in the accounting offices of industrial enterprises all over the world.
如前所述,这些机器都是针对会计的特殊应用而设计的。在每种情况下,他们都关心算术的四种基本运算,而不是代数性质的运算。然而,它们的存在使得构建专门为数学科学目的而设计的自动计算机成为可能。
As previously stated, these machines are all designed with a view to special applications to accounting. In every case they are concerned with the four fundamental operations of arithmetic, and not with operations of algebraic character. Their existence, however, makes possible the construction of an automatic calculating machine specially designed for the purposes of the mathematical sciences.
已经表明,从科学诞生之日起,人们就感受到了计算中对机械辅助的需求,但目前这种需求比以往任何时候都更加强烈。近年来数学和物理科学的深入发展包括许多新的有用函数的定义,几乎所有这些函数都是由无穷级数或其他无穷过程定义的。其中大多数都没有充分列出,因此阻碍了它们在科学问题上的应用。
It has already been indicated that the need for mechanical assistance in computation has been felt from the beginning of science, but at present this need is greater than ever before. The intensive development of the mathematical and physical sciences in recent years has included the definition of many new and useful functions, nearly all of which are defined by infinite series or other infinite processes. Most of these are inadequately tabulated and their application to scientific problems is thereby retarded.
物理测量精度的提高使得物理理论中的计算变得更加精确,经验表明,计算的理论结果和实验结果之间的微小差异可能会导致新的物理效应的发现,有时具有最大的科学和工业重要性。
The increased accuracy of physical measurement has made necessary more accurate computation in physical theory, and experience has shown that small differences between computed theoretical and experimental results may lead to the discovery of a new physical effect, sometimes of the greatest scientific and industrial importance.
许多最新的科学发展,包括热电子真空管等设备,都是基于非线性效应。通常,用来表示这些物理效应的微分方程与之前研究过的任何形式都不对应,因此无法用于积分的所有方法。在这种情况下唯一可用的解决方法是无限级数展开和数值积分。这两种方法都涉及大量的计算工作。
Many of the most recent scientific developments, including such devices as the thermionic vacuum tube, are based on nonlinear effects. Only too often the differential equations designed to represent these physical effects correspond to no previously studied forms, and thus defy all methods available for their integration. The only methods of solution available in such cases are expansions in infinite series and numerical integration. Both these methods involve enormous amounts of computational labor.
当前理论物理学通过波力学的发展完全基于数学概念,并清楚地表明物理科学的未来取决于实验指导的数学推理。目前存在一些我们无法解决的问题,不是因为理论困难,而是因为机械计算手段不够。
The present development of theoretical physics through wave mechanics is based entirely on mathematical concepts and clearly indicates that the future of the physical sciences rests in mathematical reasoning directed by experiment. At present there exist problems beyond our ability to solve, not because of theoretical difficulties, but because of insufficient means of mechanical computation.
在物理科学的某些研究领域中,例如在电离层的研究中,表示现象所需的数学表达式太长且复杂,无法在印刷页上写成几行,但对此类现象的数值研究表达式对于我们研究高层大气物理是绝对必要的,无线电通信和电视的未来就依赖于此类研究。
In some fields of investigation in the physical sciences as, for instance, in the study of the ionosphere, the mathematical expressions required to represent the phenomena are too long and complicated to write in several lines across a printed page, yet the numerical investigation of such expressions is an absolute necessity to our study of the physics of the upper atmosphere, and on this type of research rests the future of radio communication and television.
这些只是物理和数学科学所面临的计算困难的几个例子,除此之外还可以添加来自天文学、相对论、甚至快速发展的数学经济学的许多其他困难。所有这些计算困难都可以通过设计合适的自动计算机器来消除。
These are but a few examples of the computational difficulties with which the physical and mathematical sciences are faced, and to these may be added many others taken from astronomy, the theory of relativity, and even the rapidly growing science of mathematical economy. All these computational difficulties can be removed by the design of suitable automatic calculating machinery.
专门为快速解决科学问题而设计的计算机器具有以下功能,而在为会计目的而制造的计算机器中则没有这些功能:
The features to be incorporated in calculating machinery specially designed for rapid work on scientific problems, and not to be found in calculating machines as manufactured for accounting purposes, are the following:
1. 普通的会计机器几乎完全涉及正数问题,而为数学目的而设计的机器必须能够处理正数和负数。
1. Ordinary accounting machines are concerned almost entirely with problems of positive numbers, while machines designed for mathematical purposes must be able to handle both positive and negative quantities.
2.出于数学目的,计算机器应该能够提供和利用各种超越函数,例如三角函数;椭圆函数、贝塞尔函数和概率函数;以及许多其他人。幸运的是,并非所有这些函数都发生在一次计算中;因此,可以设计一种从一种功能改变到另一种功能的方法,并提供适当的灵活性。
2. For mathematical purposes, calculating machinery should be able to supply and utilize a wide variety of transcendental functions, as the trigonometric functions; elliptic, Bessel, and probability functions; and many others. Fortunately, not all these functions occur in a single computation; therefore a means of changing from one function to another may be designed and the proper flexibility provided.
3. 大多数数学计算,如函数的级数计算、公式的求值、数值积分求解微分方程等,都是由重复的过程组成的。一旦建立了一个过程,它就可以无限地继续下去,直到覆盖自变量的范围,并且通常可以通过连续的相等步骤来覆盖自变量的范围。因此,一旦建立了一个过程,为数学科学应用而设计的计算机器就应该全自动运行。
3. Most of the computations of mathematics, as the calculation of a function by series, the evaluation of a formula, the solution of a differential equation by numerical integration, etc., consist of repetitive processes. Once a process is established it may continue indefinitely until the range of the independent variables is covered, and usually the range of the independent variables may be covered by successive equal steps. For this reason calculating machinery designed for application to the mathematical sciences should be fully automatic in its operation once a process is established.
4. 现有的计算机器能够逐步计算作为x的函数的ϕ(x)。因此,如果x定义在区间a < x < b内,并且ϕ(x) 是通过一系列算术运算从x获得的,则现有的做法是对区间a < x < b内x的所有值计算步骤 (1)。然后对步骤 (1) 结果的所有值执行步骤 (2),依此类推,直到得到ϕ(x)。然而,这个过程与许多数学运算所需的过程相反。为数学科学应用而设计的计算机器应该能够按行而不是按列计算,因为很多时候,就像在微分方程的数值解中一样,函数计算表中第二个值的计算取决于前面的一个或多个值。
4. Existing calculating machinery is capable of calculating ϕ(x) as a function of x by steps. Thus, if x is defined in the interval a < x < b and ϕ(x) is obtained from x by a series of arithmetical operations, the existing procedure is to compute step (1) for all values of x in the interval a < x < b. Then step (2) is accomplished for all values of the result of step (1), and so on until ϕ(x) is reached. This process, however, is the reverse of that required in many mathematical operations. Calculating machinery designed for application to the mathematical sciences should be capable of computing lines instead of columns, for very often, as in the numerical solution of a differential equation, the computation of the second value in the computed table of a function depends on the preceding value or values.
从根本上说,这四个功能是将现有的打孔卡计算机(例如由国际商业机器公司制造的计算机)转换为专门适合科学目的的机器所需要的。由于科学问题比会计问题更加复杂,所涉及的算术元素的数量必须大大增加。
Fundamentally, these four features are all that are required to convert existing punched-card calculating machines such as those manufactured by the International Business Machines Company into machines specially adapted to scientific purposes. Because of the greater complexity of scientific problems as compared to accounting problems, the number of arithmetical elements involved would have to be greatly increased.
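The contrast drawn in feature 4 between computing "columns" and computing "lines" can be sketched as follows. Both functions are hypothetical illustrations: `by_columns` evaluates ϕ(x) = (x + 1)² one step at a time across all x, while `by_lines` uses Euler's method, chosen here only as a simple example of a table in which each row depends on the row before it (the proposal itself does not name a particular integration scheme at this point).

```python
def by_columns(xs):
    """Existing procedure: finish each step for every x before starting the next."""
    step1 = [x + 1 for x in xs]     # step (1) for all values of x
    step2 = [v * v for v in step1]  # step (2) for all results of step (1)
    return step2                    # phi(x) = (x + 1)**2


def by_lines(f, x0, y0, dx, n):
    """Line-by-line procedure: each new row is computed from the previous row.

    Euler's method for dy/dx = f(x, y); no row can be started until the
    row before it is complete, so the work cannot be done column by column.
    """
    rows = [(x0, y0)]
    for _ in range(n):
        x, y = rows[-1]
        rows.append((x + dx, y + dx * f(x, y)))
    return rows


print(by_columns([0, 1, 2]))
print(by_lines(lambda x, y: y, 0.0, 1.0, 0.5, 2))
```

In the column form every entry of `step1` could be produced on one pass of cards through a machine before any entry of `step2` is needed; in the line form the second value of y is unavailable until the first is known.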
自动计算器应包含的数学运算有:
The mathematical operations which should be included in an automatic calculating machine are:
1.算术的基本运算:加、减、乘、除
1. The fundamental operations of arithmetic: addition, subtraction, multiplication, and division
2. 正数和负数
2. Positive and negative numbers
3. 圆括号和中括号:( ) + ( )、[( ) + ( )] · [( ) + ( )] 等。
3. Parentheses and brackets: ( ) + ( ), [( ) + ( )] · [( ) + ( )], etc.
4. 数的幂:整数、分数
4. Powers of numbers: integral, fractional
5. 对数:以 10 为底,所有其他底数通过乘法得到
5. Logarithms: base 10 and all other bases by multiplication
6.反对数或指数函数:以10为底和其他底数
6. Antilogarithms or exponential functions: base 10 and other bases
7. 三角函数
7. Trigonometric functions
8. 反三角函数
8. Antitrigonometric functions
9.双曲函数
9. Hyperbolic functions
10.反双曲函数
10. Antihyperbolic functions
11. 高等超越函数:概率积分、椭圆函数、贝塞尔函数
11. Superior transcendentals: probability integral, elliptic function, and Bessel function
借助这些函数,要执行的过程应该是:
With the aid of these functions, the processes to be carried out should be:
12. 公式评估和结果列表
12. Evaluation of formulae and tabulation of results
13.级数的计算
13. Computation of series
14.一阶、二阶常微分方程的解
14. Solution of ordinary differential equations of the first and second order
15.经验数据的数值积分
15. Numerical integration of empirical data
16.经验数据的数值微分
16. Numerical differentiation of empirical data
以下数学过程可以作为自动计算机器设计的基础:
The following mathematical processes may be made the basis of design of an automatic calculating machine:
1. 基本算术运算不需要注释,因为它们已经可用,除了所有其他运算最终必须简化为这些运算以便可以使用机械设备。
1. The fundamental arithmetical operations require no comment, as they are already available, save that all the other operations must eventually be reduced to these in order that a mechanical device may be utilized.
2. 幸运的是,正负号的代数非常简单。无论如何,只提供两种可能性。稍后将表明,出于机械计算的目的,这些符号可以被视为数字。
2. Fortunately the algebra of positive and negative signs is extremely simple. In any case only two possibilities are offered. Later on it will be shown that these signs may be treated as numbers for the purposes of mechanical calculation.
3. 写公式时使用圆括号和中括号,要求计算必须分段进行。因此,先获得结果的一部分,并且必须将其保留以待确定其他部分,依此类推。这意味着计算机器必须配备临时存储数字的装置,直到需要进一步使用它们为止。这样的装置可以由计数器提供。
3. The use of parentheses and brackets in writing a formula requires that the computation must proceed piecewise. Thus, a portion of the result is obtained and must be held pending the determination of some other portion, and so on. This means that a calculating machine must be equipped with means of temporarily storing numbers until they are required for further use. Such means are available in counters.
4. 数的整数幂可以通过逐次相乘得到,分数幂可以通过迭代的方法得到。因此,如果需要求 5^{1/3},
4. Integral powers of numbers may be obtained by successive multiplication, and fractional powers by the method of iteration. Thus, if it is required to find 5^{1/3},
和
and
或者
or
让
Let
这是 5 到 4 位有效数字的立方根。一般来说,θ的r次方根由表达式的迭代给出
which is the cube root of 5 to four significant figures. In general the rth root of θ is given by the iteration of the expression
最后,如果r不是整数,则可以求助于稍后描述的机械对数表。……
Finally, if r is not an integer, recourse may be had to the mechanical table of logarithms later to be described. …
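The worked equations for this iteration did not survive in the text above, so the following is only a sketch of one standard way to carry it out, not necessarily the scheme Aiken wrote down. It assumes Newton's method applied to x^r − θ = 0, which gives the update x ← ((r − 1)·x + θ / x^(r−1)) / r and needs nothing beyond the four arithmetical operations. The function name `rth_root` and its parameters are invented for this illustration.

```python
def rth_root(theta, r, x=1.0, tol=1e-12, max_iter=100):
    """Approximate theta**(1/r) for a positive integer r by Newton's method.

    Each pass uses only addition, subtraction, multiplication, and division.
    """
    for _ in range(max_iter):
        power = 1.0
        for _ in range(r - 1):  # x**(r-1) by successive multiplication
            power *= x
        new_x = ((r - 1) * x + theta / power) / r
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


print(round(rth_root(5, 3), 3))
```

Rounded to four significant figures, the value returned for θ = 5 and r = 3 agrees with the cube root of 5 quoted in the text.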
16. 经验数据的数值积分可以通过Simpson、Weddle、Gauss等规则进行。所有这些规则都涉及y的连续值乘以指定数值系数的总和。因此,唯一涉及的新机械组件是机械引入数字列表的方法。稍后将讨论实现这一点的方法。
16. The numerical integration of empirical data may be carried out by the rules of Simpson, Weddle, Gauss, and others. All these rules involve sums of successive values of y multiplied by specified numerical coefficients. Hence the only new mechanical component involved is a means of mechanically introducing a list of numbers. Means of accomplishing this will be discussed later.
17. 经验数据的数值微分最好通过差分公式来完成。大多数实验观测的精度使得只要观测点取得足够密,五阶差分就可以忽略。那么,如果可以忽略五阶以上的所有差分,数值微分过程就可以通过诸如巴贝奇最初设计的五阶差分机来执行。然而,这样的设备可以用标准的加减法机器组装而成,只需进行少量改动。该微分装置也适用于许多其他问题。事实上,已经讨论的大多数问题在某些情况下都可以通过应用差分公式来解决。
17. Numerical differentiation of empirical data is best accomplished by means of a difference formula. Most experimental observations are of such an accuracy that fifth differences may be neglected by taking observations sufficiently close together. If, then, all differences above the fifth may be neglected, the process of numerical differentiation may be carried out by a fifth difference engine such as originally designed by Babbage. Such a device can, however, be assembled from standard addition-subtraction machines with but a few changes. The differentiating apparatus would also be applicable to many other problems. In fact, most of the problems already discussed may under certain circumstances be solved by application of difference formulae.
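Item 17 can be illustrated with a short sketch of differentiation by a difference formula. The text does not say which formula is meant, so this assumes the forward-difference series dy/dx ≈ (Δ − Δ²/2 + Δ³/3 − Δ⁴/4 + Δ⁵/5) / h at the first tabulated point, truncated after the fifth difference as the passage suggests. The names `forward_differences` and `derivative_at_start` are invented for this illustration.

```python
def forward_differences(ys):
    """Return the leading differences of a table, built by repeated subtraction."""
    diffs, row = [], list(ys)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[0])
    return diffs


def derivative_at_start(ys, h, order=5):
    """Estimate dy/dx at the first tabulated point from equally spaced values.

    Differences above `order` are neglected, as the text proposes for the fifth.
    """
    diffs = forward_differences(ys)[:order]
    total = sum((-1) ** k * d / (k + 1) for k, d in enumerate(diffs))
    return total / h


# Six equally spaced values of y = x**3, starting at x = 1 with spacing 0.1.
h = 0.1
ys = [(1 + k * h) ** 3 for k in range(6)]
print(derivative_at_start(ys, h))
```

Building the difference table takes only subtraction, which is consistent with the remark that such a device could be assembled from standard addition-subtraction machines.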
上一节表明,即使是复杂的数学运算也可以简化为涉及算术基本规则的重复过程。目前国际商业机器公司的计算机能够执行以下操作:
In the last section it was shown that even complicated mathematical operations may be reduced to a repetitive process involving the fundamental rules of arithmetic. At present the calculating machines of the International Business Machines Company are capable of carrying out such operations as:
在这些方程中,A、B、C、D是穿孔卡片上的数字列表,结果F也是通过穿孔卡片获得的。然后, F卡可以通过另一台机器并在另一次计算中打印或用作A、B、...卡。
In these equations A, B, C, D are tabulations of numbers on punched cards, and F, the result, is also obtained through punched cards. The F cards may then be put through another machine and printed or utilized as A, B, …, cards in another computation.
将给定机器从任何操作(7.2)更改为任何其他操作是通过插板上的电线来完成的。在熟练的操作员手中,这种改变可以在几分钟内完成。
Changing a given machine from any of the operations (7.2) to any other is accomplished by means of electrical wiring on a plug board. In the hands of a skilled operator such changes can be made in a few minutes.
这里不再进一步描述IBM机器的机制。只要说上一节中描述的所有操作都可以通过这些现有的机器来完成,只要配备适当的控制装置并组装足够的数量即可。因此,适用于数学运算的自动计算机的整个设计问题被简化为合适的控制设计问题,甚至对于简单的算术运算,该问题也已得到解决。
No further effort will be made here to describe the mechanism of the IBM machines. Suffice it to say that all the operations described in the last section can be accomplished by these existing machines when equipped with suitable controls, and assembled in sufficient number. The whole problem of design of an automatic calculating machine suitable for mathematical operations is thus reduced to a problem of suitable control design, and even this problem has been solved for simple arithmetical operations.
专用控制装置的主要特点是机器切换和用连续穿孔带替换穿孔卡。为了使切换顺序能够快速地改变为任何可能的顺序,切换机构本身应当利用纸带控制,其中数学公式可以由适当设置的穿孔来表示。
The main features of the specialized controls are machine switching and replacement of the punched cards by continuous perforated tapes. In order that the switching sequence can be changed quickly to any possible sequence, the switching mechanism should itself utilize a paper tape control in which mathematical formulae may be represented by suitable disposed perforations.
目前,自动计算器被想象为一个交换机,其上安装有各种计算机设备。配电盘的每个面板都进行一定的数学运算。
At present the automatic calculator is visualized as a switchboard on which are mounted various pieces of calculating machine apparatus. Each panel of the switchboard is given over to definite mathematical operations.
以下是所需设备的粗略轮廓:
The following is a rough outline of the apparatus required:
1. IBM 机器使用两种电压:用于电机运转的 120 伏交流电,以及用于继电器等操作的 32 伏直流电。必须提供一块主电源面板,包括对 110 伏交流/32 伏直流电动发电机的控制,以及对所有电路的充分保险丝保护。
1. IBM machines utilize two electric potentials: 120 volts ac for motor operation, and 32 volts dc for relay operation, etc. A main power supply panel would have to be provided including control for a 110-volt-ac/32-volt-dc motor generator and adequate fuse protection for all circuits.
2. 主控制面板:该控制的目的是通过机器路由数字流并开始操作。涉及的过程是: (a) 将位置( x )的数字传递到位置( y );(b) 开始位置 ( y ) 预定的操作。主控制器本身必须受联锁的约束,以防止在确定数字值之前尝试删除该数字,或者在上一个操作完成之前在位置 ( y ) 开始第二个操作。
2. Master control panel: The purpose of this control is to route the flow of numbers through the machines and to start operation. The processes involved are: (a) Deliver the number in position (x) to position (y); and (b) start the operation for which position (y) is intended. The master control must itself be subject to interlocking to prevent the attempt to remove a number before its value is determined, or to begin a second operation in position (y) before a previous operation is finished.
需要有四个这样的主控制器,每个主控制器都能够控制整个机器或其任何部件。因此,对于复杂的问题,可以将所有资源集中在一起;对于较简单的问题,需要较少的资源,并且可以同时处理多个问题。
It would be desirable to have four such master controls, each capable of controlling the entire machine or any of its parts. Thus, for complicated problems the entire resources could be thrown together; for simpler problems fewer resources are required and several problems could be in progress at the same time.
3. 任何计算中自变量的进度均等步前进,但可根据增量变化进行手动调整。获得此类算术序列的最简单方法是将第一个值x 0与增量Δ x一起提供给加法机。然后连续添加Δ x将得到所需的序列。
3. The progress of the independent variable in any calculation would go forward by equal steps subject to manual readjustment for change in the increment. The easiest way to obtain such an arithmetical sequence is to supply a first value, x0, to an adding machine, together with an increment Δx. Then successive additions of Δx will give the sequence desired.
应该有四个这样的自变量装置,以便 (a) 计算涉及四个变量的公式;(b) 独立操作四个主控制器。
There should be four such independent variable devices in order to (a) calculate formulae involving four variables; and (b) operate four master controls independently.
4. 某些常数:许多数学公式都涉及某些常数,例如e、π、 log 10 e等。这些常量应永久安装并随时可用。
4. Certain constants: many mathematical formulae involve certain constants such as e, π, log10 e, and so forth. These constants should be permanently installed and available at all times.
5. 数学公式几乎总是涉及常数。在计算作为自变量函数的公式时,这些常数会被反复使用。因此,机器应为这些常量提供 24 个可调节的数字位置。
5. Mathematical formulae nearly always involve constant quantities. In the computation of a formula as a function of an independent variable these constants are used over and over again. Hence the machine should be supplied with 24 adjustable number positions for these constants.
6. 在无穷级数的计算中,24这个数字可能会大大超过。为了解决这种情况,应该可以通过穿孔带引入特定值,通过将带向前移动一个位置来提供连续值。应提供两个这样的设备。
6. In the evaluation of infinite series the number 24 might be greatly exceeded. To take care of this case it should be possible to introduce specific values by means of a perforated tape, the successive values being supplied by moving the tape ahead one position. Two such devices should be supplied.
7. 对于非重复性操作的经验数据的引入可以通过标准穿孔卡杂志进给来最好地完成。应提供一台这样的设备。
7. The introduction of empirical data for nonrepetitive operations can be accomplished best by standard punched-card magazine feed. One such device should be supplied.
8. 在涉及圆括号和方括号的计算的各个阶段,可能有必要在计算其他部分之前保留结果的一部分。如果结果保存在计算单元中,则这些元素不可用于执行后续步骤。因此,有必要将数字从计算单元中移除并暂时存储在存储位置中。应该有十二个这样的职位。
8. At various stages of a computation involving parentheses and brackets it may be necessary to hold a part of the result pending the computation of some other part. If results are held in the calculating units, these elements are not available for carrying out succeeding steps. Therefore it is necessary that numbers may be removed from the calculating units and temporarily stored in storage positions. Twelve such positions should be available.
9. 算术的基本运算可以在三种机器上进行:加法和减法、乘法和除法。除了与超越函数直接相关的单元之外,还应提供每个单元的四个单元。
9. The fundamental operations of arithmetic may be carried on three machines: addition and subtraction, multiplication, and division. Four units of each should be supplied in addition to those directly associated with the transcendental functions.
10. 永久安装的数学函数应包括:对数、反对数、正弦、余弦、反正弦和反正切。
10. The permanently installed mathematical functions should include: logarithms, antilogarithms, sines, cosines, inverse sines, and inverse tangents.
11. 两个单元,用于 MacLauren 系列根据需要扩展其他功能。
11. Two units for MacLauren series expansion of other functions as needed.
12. 为了对经验数据进行微分和积分过程,应提供足够的加法和减法累加器来计算五次差值。
12. In order to carry out the process of differentiation and integration on empirical data, adding and subtracting accumulators should be provided sufficient to compute out to fifth differences.
13. 所有结果均应打印、随意打孔在纸带上或卡片上。最终结果将被打印。中间结果将被打孔,为进一步的计算做准备。
13. All results should be printed, punched in paper tapes, or in cards, at will. Final results would be printed. Intermediate results would be punched in preparation for further calculations.
据信,刚刚列举的由自动切换控制的装置应该能够解决所遇到的大多数问题。
It is believed that the apparatus just enumerated, controlled by automatic switching, should care for most of the problems encountered.
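Items 3 and 12 of the outline can be sketched together in a few lines. The following is a minimal illustration, not a model of the proposed hardware: the function names `arithmetic_sequence` and `difference_table` are hypothetical, and exact fractions stand in for the machine's decimal number positions. It generates the independent variable by successive addition of the increment and then forms differences out to the fifth order.

```python
from fractions import Fraction


def arithmetic_sequence(x0, dx, count):
    """Generate x0, x0 + dx, ... by successive addition (item 3)."""
    values = []
    x = x0
    for _ in range(count):
        values.append(x)
        x = x + dx  # one pass through the "adding machine"
    return values


def difference_table(ys, max_order=5):
    """Return [ys, first differences, ..., max_order-th differences] (item 12)."""
    table = [list(ys)]
    for _ in range(max_order):
        prev = table[-1]
        if len(prev) < 2:
            break  # not enough tabulated values for a further difference
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table


# Tabulate y = x^5 at eight equally spaced points with increment 1/2.
xs = arithmetic_sequence(Fraction(0), Fraction(1, 2), 8)
ys = [x ** 5 for x in xs]
table = difference_table(ys)

# The fifth differences of a fifth-degree polynomial are constant.
print(table[5])
```

For a polynomial of degree five the fifth differences are constant, which is why carrying differences to a fixed order suffices for interpolating and integrating smoothly varying tabulated data.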
IBM 机器所达到的速度可以从下面的乘法表中了解,其中 2 × 8 指的是 8 位有效数字与 2 位有效数字的乘法,零不计算在内(图 7.1 )。
An idea of the speed attained by the IBM machines can be had from the following tabulation of multiplication in which 2 × 8 refers to the multiplication of an 8 significant figure number by a 2 significant figure number, zeros not counted (Figure 7.1).
图 7.1: 预期的乘法速度
Figure 7.1: Expected multiplication speed
在计算10位对数时,平均速度约为每小时90。如果需要从1000到100,000的所有自然数的10位对数,计算时间将约为1100小时,即50天,没有时间进行加法或打印。这是合理的,因为这些操作非常快并且可以在乘法时间内执行。
In the computation of 10-place logarithms the average speed would be about 90 per hour. If all the 10-place logarithms of the natural numbers from 1000 to 100,000 were required, the time of computation would be approximately 1100 hours, or 50 days, allowing no time for addition or printing. This is justified since these operations are extremely rapid and can be carried out during the multiplying time.
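The estimate can be checked with a few lines of arithmetic. This is only a sanity check under the figures quoted in the text (90 logarithms per hour, arguments 1000 through 100,000); the variable names are illustrative, and the conversion to days assumes continuous 24-hour operation, which the text does not state.

```python
first, last = 1000, 100_000
rate_per_hour = 90  # average speed quoted in the text

count = last - first + 1       # number of logarithms to compute
hours = count / rate_per_hour  # about 1100 hours
days = hours / 24              # assuming round-the-clock operation

print(round(hours), round(days, 1))
```

Under that assumption the run comes to a little under 46 days, so the "50 days" in the text is best read as a rounded figure.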
上述示例中使用了十位有效数字。如果所有数字都达到这一精度,则需要在大多数计算组件上提供 23 个数字位置,其中小数点左边 10 个,右边 12 个,以及一个用于加减的位置。右边的十二个中,有两个将作为守卫位置并被丢弃。
Ten significant figures have been used in the above examples. If all numbers were to be given to this accuracy it would be necessary to provide 23 number positions on most of the computing components, 10 to the left of the decimal point, 12 to the right, and one for plus and minus. Of the twelve to the right, two would be guard places and thrown away.
正如已经提到的,所有计算结果都将以表格形式打印。通过光刻,这些结果可以直接打印,无需排版或校对。这不仅表明数学函数的出版大大节省了,而且还消除了许多错误的可能性。
As already mentioned, all computed results would be printed in tabular form. By means of photolithography these results could be printed directly without type setting or proof reading. Not only does this indicate a great saving in the publishing of mathematical functions, but it also eliminates many possibilities of error.
转载自艾肯等人。(1964),经哈佛大学档案馆许可。
Reprinted from Aiken et al. (1964), with permission from the Harvard University Archives.
由开关连接的电线要么导电,要么不导电,具体取决于互连模式和开关的设置。早在 1886 年,哲学家查尔斯·桑德斯·皮尔斯 (Charles Sanders Peirce) 就认识到逻辑连接词与串联和并联电路的关系。但真正利用布尔思维定律的电学解释的是克劳德·香农(Claude Shannon,1916-2001)。香农在密歇根州北部农村长大,从小摆弄电路,并架设了一条通往朋友家的铁丝网电报线。到 20 世纪 30 年代,无线电和电话行业蓬勃发展,香农在密歇根大学学习电气工程。他在哲学课上遇到了布尔的著作,作为麻省理工学院的研究生,他在两个二值系统之间建立了联系,从而催生了一种管理复杂电路设计问题的极其优雅的方法。
Electric wires connected by switches either conduct electricity or don’t, depending on the interconnection pattern and the settings of the switches. As early as 1886, the philosopher Charles Sanders Peirce had recognized the relation of logical connectives to series and parallel electric circuits. But it was Claude Shannon (1916–2001) who exploited the electric interpretation of Boole’s laws of thought. Shannon had tinkered with electric circuits while growing up in rural northern Michigan and had built a barbed wire telegraph to a friend’s house. By the 1930s the radio and telephone industries were flourishing, and Shannon studied electrical engineering at the University of Michigan. He encountered Boole’s writings in a philosophy class, and as an MIT graduate student he made the connection between the two binary-valued systems—thus giving birth to a profoundly elegant methodology for managing complex circuit design problems.
今天,我们理所当然地认为电路的预期行为可以用数学方法描述,并且可以使用数学规则操纵所得公式,然后将其“编译”到硬件中。香农在他的硕士论文中是第一个这样做的。本文摘录自一篇论文,该论文成为该论文的一部分;从那时起,它对电路设计产生了巨大的影响。它涉及电路的分析和综合,其中一些电路比我们简短选择的简单电路复杂得多。在我们没有包含的一段文章中,香农的论文继续证明了某些布尔函数在这些电路形式的特定限制下的电路大小的上限和下限,从而预示了电路复杂性的丰富而重要的领域。
Today we take it for granted that the intended behavior of a circuit can be described mathematically, and that the resulting formulas can be manipulated using mathematical rules and then “compiled” into hardware. Shannon, in his Master’s thesis, was the first to do it. This selection is an excerpt from a paper that became part of that thesis; it has had immense impact on circuit design ever since. It deals with the analysis and synthesis of electrical circuits, some far more complicated than the simple ones in our brief selection. In a passage we do not include, Shannon’s paper goes on to prove upper and lower bounds on the size of circuits for certain boolean functions under particular restrictions about the form of those circuits, thus foreshadowing the rich and important field of circuit complexity.
在复杂电气系统的控制和保护电路中,经常需要对继电器触点和开关进行复杂的互连。这些电路的示例出现在自动电话交换机、工业电机控制设备以及几乎所有设计用于自动执行复杂操作的电路中。在本文中,将对此类网络的某些特性进行数学分析。将特别关注网络综合问题。给定某些特性,需要找到一种结合这些特性的电路。此类问题的解不唯一且将研究寻找需要最少数量的继电器触点和开关刀片的特定电路的方法。还将描述用于寻找在所有操作特性方面与给定电路等效的任意数量的电路的方法。将证明阻抗网络上的几个著名定理在继电器电路中具有大致相似的定理。其中值得注意的是 delta-wye 和星形网格变换以及对偶定理。
In the control and protective circuits of complex electrical systems it is frequently necessary to make intricate interconnections of relay contacts and switches. Examples of these circuits occur in automatic telephone exchanges, industrial motor-control equipment, and in almost any circuits designed to perform complex operations automatically. In this paper a mathematical analysis of certain of the properties of such networks will be made. Particular attention will be given to the problem of network synthesis. Given certain characteristics, it is required to find a circuit incorporating these characteristics. The solution of this type of problem is not unique and methods of finding those particular circuits requiring the least number of relay contacts and switch blades will be studied. Methods will also be described for finding any number of circuits equivalent to a given circuit in all operating characteristics. It will be shown that several of the well-known theorems on impedance networks have roughly analogous theorems in relay circuits. Notable among these are the delta-wye and star-mesh transformations, and the duality theorem.
解决这些问题的方法可以简单描述如下:任何电路都由一组方程表示,方程的项对应于电路中的各种继电器和开关。微积分是为了通过简单的数学过程来处理这些方程而开发的,其中大多数与普通代数算法类似。这种演算被证明与逻辑符号研究中使用的命题演算完全相似。对于综合问题,首先将所需的特性写成方程组,然后将方程处理成代表最简单电路的形式。然后可以立即从方程中得出电路。通过这种方法,总是可以找到仅包含串联和并联连接的最简单电路,并且在某些情况下找到包含任何类型连接的最简单电路。
The method of attack on these problems may be described briefly as follows: any circuit is represented by a set of equations, the terms of the equations corresponding to the various relays and switches in the circuit. A calculus is developed for manipulating these equations by simple mathematical processes, most of which are similar to ordinary algebraic algorithms. This calculus is shown to be exactly analogous to the calculus of propositions used in the symbolic study of logic. For the synthesis problem the desired characteristics are first written as a system of equations, and the equations are then manipulated into the form representing the simplest circuit. The circuit may then be immediately drawn from the equations. By this method it is always possible to find the simplest circuit containing only series and parallel connections, and in some cases the simplest circuit containing any type of connection.
我们的符号主要取自符号逻辑。在许多常用的系统中,我们选择了一个看起来最简单、对我们的解释最具启发性的系统。我们的一些术语,例如节点、网状、三角形、星形等,是从普通网络理论中借用的,用于表示开关电路中的简单概念。
Our notation is taken chiefly from symbolic logic. Of the many systems in common use we have chosen the one which seems simplest and most suggestive for our interpretation. Some of our phraseology, such as node, mesh, delta, wye, etc., is borrowed from ordinary network theory for simple concepts in switching circuits.
我们将限制对仅包含继电器触点和开关的电路的处理,因此在任何给定时间,任何两个端子之间的电路必须是开路(无限阻抗)或闭合(零阻抗)。让我们将符号X ab或更简单地X与端子a和b相关联。这个变量是时间的函数,被称为两端电路a – b的阻碍。符号 0(零)将用于表示闭路的障碍,符号 1(单位)将用于表示开路的障碍。因此,当电路a − b开路时, X ab = 1;当电路闭合时, X ab = 0。如果每当电路a − b开路时,电路c − d为,则两个障碍X ab和X cd相等开,每当a − b闭时,c − d也闭。现在将符号+(加号)定义为表示阻抗相加的两端电路的串联。因此,当b和c连接在一起时, X ab + X cd是电路a - d的阻碍。类似地,两个障碍X ab · X cd的乘积,或者更简单地X ab X cd将被定义为表示通过并联连接电路a - b和c - d形成的电路的障碍。继电器触点或开关在电路中用图 8.1中的符号表示,字母是相应的阻碍功能。图 8.2显示了加号的解释,图 8.3显示了乘号的解释。这种符号的选择使得障碍物的处理与普通的数值代数非常相似。
We shall limit our treatment to circuits containing only relay contacts and switches, and therefore at any given time the circuit between any two terminals must be either open (infinite impedance) or closed (zero impedance). Let us associate a symbol Xab, or more simply X, with the terminals a and b. This variable, a function of time, will be called the hindrance of the two-terminal circuit a − b. The symbol 0 (zero) will be used to represent the hindrance of a closed circuit, and the symbol 1 (unity) to represent the hindrance of an open circuit. Thus when the circuit a − b is open Xab = 1 and when closed Xab = 0. Two hindrances Xab and Xcd will be said to be equal if whenever the circuit a − b is open, the circuit c − d is open, and whenever a − b is closed, c − d is closed. Now let the symbol + (plus) be defined to mean the series connection of the two-terminal circuits whose hindrances are added together. Thus Xab + Xcd is the hindrance of the circuit a − d when b and c are connected together. Similarly the product of two hindrances Xab · Xcd, or more briefly XabXcd, will be defined to mean the hindrance of the circuit formed by connecting the circuits a − b and c − d in parallel. A relay contact or switch will be represented in a circuit by the symbol in Figure 8.1, the letter being the corresponding hindrance function. Figure 8.2 shows the interpretation of the plus sign and Figure 8.3 the multiplication sign. This choice of symbols makes the manipulation of hindrances very similar to ordinary numerical algebra.
图 8.1: 阻碍功能符号
Figure 8.1: Symbol for hindrance function
图 8.2: 加法的解释
Figure 8.2: Interpretation of addition
图 8.3: 乘法的解释
Figure 8.3: Interpretation of multiplication
显然,根据上述定义,以下假设成立:
It is evident that with the above definitions, the following postulates will hold:
假设 Postulates

| | 等式 Equation | 中文解释 | Interpretation |
|---|---|---|---|
| 1a | 0 · 0 = 0 | 闭合电路与闭合电路并联是闭合电路。 | A closed circuit in parallel with a closed circuit is a closed circuit. |
| 1b | 1 + 1 = 1 | 开路与开路串联是开路。 | An open circuit in series with an open circuit is an open circuit. |
| 2a | 1 + 0 = 0 + 1 = 1 | 开路与闭合电路以任一顺序串联(即无论开路在闭合电路的右侧还是左侧)都是开路。 | An open circuit in series with a closed circuit in either order (i.e., whether the open circuit is to the right or left of the closed circuit) is an open circuit. |
| 2b | 0 · 1 = 1 · 0 = 0 | 闭合电路与开路以任一顺序并联都是闭合电路。 | A closed circuit in parallel with an open circuit in either order is a closed circuit. |
| 3a | 0 + 0 = 0 | 闭合电路与闭合电路串联是闭合电路。 | A closed circuit in series with a closed circuit is a closed circuit. |
| 3b | 1 · 1 = 1 | 开路与开路并联是开路。 | An open circuit in parallel with an open circuit is an open circuit. |
| 4 | X = 0 or X = 1 | 在任何给定时间,X = 0 或 X = 1。 | At any given time either X = 0 or X = 1. |
这些足以发展将用于仅包含串联和并联连接的电路的所有定理。这些公设成对排列,以强调加法和乘法运算与数量零和一之间的对偶关系。因此,如果在任何a假设中,将 0 替换为 1,将乘法替换为加法,反之亦然,则将得到相应的b假设。这个事实非常重要。它为每个定理提供了一个对偶定理,只需证明一个定理即可建立两个定理。这些假设中唯一与普通代数不同的是 1 b。然而,这使得这些符号的操作变得非常简单。
These are sufficient to develop all the theorems which will be used in connection with circuits containing only series and parallel connections. The postulates are arranged in pairs to emphasize a duality relationship between the operations of addition and multiplication and the quantities zero and one. Thus if in any of the a postulates the zero’s are replaced by one’s and the multiplications by additions and vice versa, the corresponding b postulate will result. This fact is of great importance. It gives each theorem a dual theorem, it being necessary to prove only one to establish both. The only one of these postulates which differs from ordinary algebra is 1b. However, this enables great simplifications in the manipulation of these symbols.
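The postulates and the duality between them can be checked mechanically. The sketch below is an illustration only: it encodes hindrances as the integers 0 (closed) and 1 (open), and the names `series`, `parallel`, `dualize`, and `holds` are hypothetical helpers, not notation from the paper.

```python
def series(x, y):
    """Shannon's + : two circuits in series are open if either one is open."""
    return x | y


def parallel(x, y):
    """Shannon's · : two circuits in parallel are open only if both are open."""
    return x & y


# The "a" postulates as (left operand, operation, right operand, result).
POSTULATES_A = [
    (0, parallel, 0, 0),  # 1a: 0 · 0 = 0
    (1, series, 0, 1),    # 2a: 1 + 0 = 1
    (0, series, 1, 1),    # 2a: 0 + 1 = 1
    (0, series, 0, 0),    # 3a: 0 + 0 = 0
]


def dualize(postulate):
    """Swap 0 with 1 and series with parallel, as the duality principle prescribes."""
    x, op, y, result = postulate
    swapped = series if op is parallel else parallel
    return (1 - x, swapped, 1 - y, 1 - result)


def holds(postulate):
    """Check that the stated result matches the circuit interpretation."""
    x, op, y, result = postulate
    return op(x, y) == result


for p in POSTULATES_A:
    print(holds(p), holds(dualize(p)))
```

Each dualized "a" postulate comes out as the corresponding "b" postulate, which is the sense in which proving one theorem of a pair establishes the other.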
例如,要证明定理 8.4a,请注意X为 0 或 1。如果为 0,则该定理由公设 2b 得出:如果为 1,则由公设 3b 得出。定理 8.4b 现在遵循对偶原理,用 0 代替 1,用 + 代替·。
For example, to prove Theorem 8.4a, note that X is either 0 or 1. If it is 0, the theorem follows from Postulate 2b; if it is 1, it follows from Postulate 3b. Theorem 8.4b now follows by the duality principle, replacing the 1 by 0 and the · by +.
由于结合律(8.2a 和 8.2b),在几个项的和或乘积中可以省略括号而不会产生歧义。Σ和Π符号的使用方式与普通代数相同。
Due to the associative laws (8.2a and 8.2b) parentheses may be omitted in a sum or product of several terms without ambiguity. The Σ and Π symbols will be used as in ordinary algebra.
分配律(8.3a)使得“相乘”乘积和因式求和成为可能。然而,该定理 (8.3b) 的对偶在数值代数中并不成立。
The distributive law (8.3a) makes it possible to “multiply out” products and to factor sums. The dual of this theorem, (8.3b), however, is not true in numerical algebra.
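The displayed theorems themselves are not reproduced in this extract, so the sketch below states them in the forms the surrounding text implies: associative laws for 8.2, X(Y + Z) = XY + XZ for 8.3a and its dual X + YZ = (X + Y)(X + Z) for 8.3b, and 1 · X = X with dual 0 + X = X for 8.4. These forms are an assumption reconstructed from the prose, and the function names are hypothetical. Each law is verified by trying every combination of 0 and 1.

```python
from itertools import product


def plus(x, y):
    """Series connection of two hindrances."""
    return x | y


def times(x, y):
    """Parallel connection of two hindrances."""
    return x & y


def holds_for_all(law, arity):
    """Check a law for every assignment of 0 and 1 to its variables."""
    return all(law(*values) for values in product((0, 1), repeat=arity))


def associative_sum(x, y, z):
    return plus(plus(x, y), z) == plus(x, plus(y, z))


def associative_product(x, y, z):
    return times(times(x, y), z) == times(x, times(y, z))


def distributive(x, y, z):
    # X(Y + Z) = XY + XZ
    return times(x, plus(y, z)) == plus(times(x, y), times(x, z))


def distributive_dual(x, y, z):
    # X + YZ = (X + Y)(X + Z)
    return plus(x, times(y, z)) == times(plus(x, y), plus(x, z))


def identity_product(x):
    # 1 · X = X
    return times(1, x) == x


def identity_sum(x):
    # 0 + X = X
    return plus(0, x) == x


def dual_fails_numerically():
    """In ordinary arithmetic the dual distributive law breaks: 1 + 1*1 is not (1+1)*(1+1)."""
    x, y, z = 1, 1, 1
    return x + y * z != (x + y) * (x + z)


print(holds_for_all(distributive_dual, 3), dual_fails_numerically())
```

The last function illustrates the remark that the dual distributive law has no counterpart in numerical algebra, although it holds for hindrances.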
我们现在将定义一个称为否定的新操作。阻力X的负值将写作X ′,并被定义为一个变量,当X等于 0 时等于 1,当X等于 1 时等于 0。如果X是继电器闭合触点的阻力,则X ′为同一继电器断路触点的阻碍。障碍的负数的定义给出了以下定理:
We shall now define a new operation to be called negation. The negative of a hindrance X will be written X′ and is defined to be a variable which is equal to 1 when X equals 0 and equal to 0 when X equals 1. If X is the hindrance of the make contacts of a relay, then X′ is the hindrance of the break contacts of the same relay. The definition of the negative of a hindrance gives the following theorems:
1. K类至少包含两个不同的元素。
1. The class K contains at least two distinct elements.
2. 如果a和b在K类中,则a + b在K类中。
2. If a and b are in the class K then a + b is in the class K.
3. a + b = b + a。
3. a + b = b + a.
4. ( a + b )+ c = a +( b + c )。
4. (a + b) + c = a + (b + c).
5. a + a = a。
5. a + a = a.
6. ab + ab ' = a其中ab定义为 ( a ' + b ')'。
6. ab + ab′ = a where ab is defined as (a′ + b′)′.
如果我们让类K是由两个元素 0 和 1 组成的类,那么这些假设就来自第一部分中给出的假设。给出的假设 1、2 和 3 也可以从亨廷顿假设中推导出来。添加 4 并将我们的讨论限制在命题演算上,显然,开关电路演算与符号逻辑的这个分支之间存在完美的类比。符号的两种解释如图 8.4所示。
If we let the class K be the class consisting of the two elements 0 and 1, then these postulates follow from those given in the first section. Also Postulates 1, 2, and 3 given there can be deduced from Huntington’s postulates. Adding 4 and restricting our discussion to the calculus of propositions, it is evident that a perfect analogy exists between the calculus for switching circuits and this branch of symbolic logic. The two interpretations of the symbols are shown in Figure 8.4.
图 8.4: 命题演算和符号中继分析之间的类比
Figure 8.4: Analogue between the calculus of propositions and the symbolic relay analysis
由于这种类比,如果用中继电路来解释,命题演算的任何定理也是真定理。本节中的其余定理直接取自该领域。
Due to this analogy any theorem of the calculus of propositions is also a true theorem if interpreted in terms of relay circuits. The remaining theorems in this section are taken directly from this field.
德摩根定理:
De Morgan’s theorem:
该定理根据被加数或因子的负数给出了和或乘积的负数。通过替换所有可能的值,可以很容易地验证两项,然后通过数学归纳法扩展到任意数量的n 个变量。
This theorem gives the negative of a sum or product in terms of the negatives of the summands or factors. It may be easily verified for two terms by substituting all possible values and then extended to any number n of variables by mathematical induction.
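The verification by substituting all possible values can be carried out directly. The sketch below assumes the usual statement of the theorem, namely that the negative of a sum is the product of the negatives and the negative of a product is the sum of the negatives; the displayed equation is not reproduced in this extract, and the helper names are hypothetical. Instead of induction it simply enumerates every assignment for several values of n.

```python
from functools import reduce
from itertools import product
from operator import and_, or_


def neg(x):
    """Negation of a hindrance: 0 becomes 1 and 1 becomes 0."""
    return 1 - x


def de_morgan_holds(n):
    """Check both forms of the theorem for every assignment of n variables."""
    for values in product((0, 1), repeat=n):
        negatives = [neg(v) for v in values]
        # (X1 + X2 + ... + Xn)' = X1' · X2' · ... · Xn'
        if neg(reduce(or_, values)) != reduce(and_, negatives):
            return False
        # (X1 · X2 · ... · Xn)' = X1' + X2' + ... + Xn'
        if neg(reduce(and_, values)) != reduce(or_, negatives):
            return False
    return True


print([de_morgan_holds(n) for n in range(2, 6)])
```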
某些变量X 1 , X 2 , … , X n的函数是由变量通过加法、乘法和求反运算形成的任何表达式。符号f ( X 1 , X 2 , … , X n ) 将用于表示函数。因此我们可能有f ( X, Y, Z ) = XY + X ′( Y ′ + Z ′)。在无穷小微积分中,表明任何函数(只要它是连续的并且所有导数都是连续的)都可以展开为泰勒级数。在命题演算中可能存在某种类似的扩展。要开发函数的级数展开式,首先请注意以下等式: [编辑: Cf. 布尔,第 44 页]
A function of certain variables X1, X2, …, Xn is any expression formed from the variables with the operations of addition, multiplication, and negation. The notation f(X1, X2, …, Xn) will be used to represent a function. Thus we might have f(X, Y, Z) = XY + X′(Y′ + Z′). In infinitesimal calculus it is shown that any function (providing it is continuous and all derivatives are continuous) may be expanded in a Taylor series. A somewhat similar expansion is possible in the calculus of propositions. To develop the series expansion of functions first note the following equations: [EDITOR: Cf. Boole, page 44]
如果我们让X 1等于 0 或 1,这些就简化为恒等式。在这些方程中,函数f被称为关于X 1展开。( 8.10a )中的X 1和X 1的系数是 ( n - 1) 个变量X 2 , … , X n的函数,因此可以以相同的方式围绕这些变量中的任何一个进行展开。( 8.10b )中的加性项也可以以这种方式展开。展开关于X 2我们有:
These reduce to identities if we let X1 equal either 0 or 1. In these equations the function f is said to be expanded about X1. The coefficients of X1 and X1′ in (8.10a) are functions of the (n − 1) variables X2, …, Xn and may thus be expanded about any of these variables in the same manner. The additive terms in (8.10b) also may be expanded in this manner. Expanding about X2 we have:
继续这个过程n次,我们将得到完整的级数展开式,其形式为:
Continuing this process n times we will arrive at the complete series expansion having the form:
根据 ( 8.12a ),f等于通过以所有可能的方式对X 1、X 2、…、X n项上的素数进行排列所形成的乘积之和,并给每个乘积一个等于函数值的系数当该乘积为 1 时。 ( 8.12b )类似。
By (8.12a), f is equal to the sum of the products formed by permuting primes on the terms of X1, X2, …, Xn in all possible ways and giving each product a coefficient equal to the value of the function when that product is 1. Similarly for (8.12b).
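The expansion can be illustrated on the example function f(X, Y, Z) = XY + X′(Y′ + Z′) given earlier. Since equations (8.10) through (8.12) are not reproduced in this extract, the forms used below are assumptions based on the standard expansion: f = X1 · f(1, …) + X1′ · f(0, …) for the sum form and f = [f(0, …) + X1] · [f(1, …) + X1′] for the product form. Both reduce to identities when X1 is set to 0 or 1, as the text requires. All function names are hypothetical.

```python
from itertools import product


def f(x, y, z):
    """The example function from the text: XY + X'(Y' + Z')."""
    nx, ny, nz = 1 - x, 1 - y, 1 - z
    return (x & y) | (nx & (ny | nz))


def expand_sum_form(func, x1, *rest):
    """X1 · f(1, ...) + X1' · f(0, ...)  (assumed form of 8.10a)."""
    return (x1 & func(1, *rest)) | ((1 - x1) & func(0, *rest))


def expand_product_form(func, x1, *rest):
    """[f(0, ...) + X1] · [f(1, ...) + X1']  (assumed form of 8.10b)."""
    return (func(0, *rest) | x1) & (func(1, *rest) | (1 - x1))


def full_expansion(func, n):
    """Terms of the complete expansion: one (assignment, coefficient) pair per product."""
    return [(a, func(*a)) for a in product((0, 1), repeat=n)]


def evaluate_expansion(terms, values):
    """Sum the products; a product is 1 only when the variables match its assignment."""
    total = 0
    for assignment, coefficient in terms:
        term = coefficient
        for a, v in zip(assignment, values):
            term &= v if a == 1 else 1 - v  # primed variable where a == 0
        total |= term
    return total


terms = full_expansion(f, 3)
for values in product((0, 1), repeat=3):
    print(values, f(*values), evaluate_expansion(terms, values))
```

Each coefficient in the complete expansion is the value of the function at the one assignment that makes the corresponding product equal to 1, which is how the text describes (8.12a).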
作为级数展开式的应用,应该注意的是,如果我们希望找到代表任何给定函数的电路,我们总是可以通过( 8.10a)或(8.10b )来展开函数,使得任何给定变量出现在最多两次,一次作为接通接触,一次作为断开接触。如图 8.5所示。类似地,根据(8.11a)和(8.11b),任何其他变量需要出现不超过四次(两个接通和两个断开触点),等等。
As an application of the series expansion it should be noted that if we wish to find a circuit representing any given function we can always expand the function by either (8.10a) or (8.10b) in such a way that any given variable appears at most twice, once as a make contact and once as a break contact. This is shown in Figure 8.5. Similarly by (8.11a) and (8.11b) any other variable need appear no more than four times (two make and two break contacts), etc.
图 8.5: 关于一个变量的展开
Figure 8.5: Expansion about one variable
德摩根定理的推广用以下等式象征性地表示:
A generalization of De Morgan’s theorem is represented symbolically in the following equation:
我们的意思是,任何函数的负数都可以通过将每个变量替换为其负数并交换+和·符号来获得。当然,显式和隐式括号将保留在相同的位置。例如,X + Y ( Z + WX ') 的负数将为X '[ Y ' + Z '( W ' + X )]。
By this we mean that the negative of any function may be obtained by replacing each variable by its negative and interchanging the + and · symbols. Explicit and implicit parentheses will, of course, remain in the same places. For example, the negative of X + Y(Z + WX′) will be X′[Y′ + Z′(W′ + X)].
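The worked example can be confirmed by perfect induction over the four variables. In this small check the function names are illustrative; the two expressions are transcribed from the text.

```python
from itertools import product


def original(w, x, y, z):
    """X + Y(Z + WX')"""
    return x | (y & (z | (w & (1 - x))))


def claimed_negative(w, x, y, z):
    """X'[Y' + Z'(W' + X)]"""
    return (1 - x) & ((1 - y) | ((1 - z) & ((1 - w) | x)))


def negation_rule_holds():
    """True if the second expression is the negative of the first for all inputs."""
    return all(
        claimed_negative(*v) == 1 - original(*v)
        for v in product((0, 1), repeat=4)
    )


print(negation_rule_holds())
```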
下面给出了一些有助于简化表达式的其他定理:
Some other theorems useful in simplifying expressions are given below:
所有这些定理都可以用完美归纳法来证明。
All of these theorems may be proved by the method of perfect induction.
任何由加法、乘法和求反运算形成的表达式都明确表示仅包含串联和并联连接的电路。这样的电路称为串并联电路。此类表达式中的每个字母代表一个接通或断开继电器触点,或一个开关刀片和触点。因此,为了找到需要最少触点数的电路,有必要将表达式处理成其中最少触点数的形式:出现字母。上面给出的定理总是足以做到这一点。只需要稍微练习一下这些符号的操作即可。幸运的是,大多数定理与数值代数的定理完全相同——代数的结合律、交换律和分配律在这里成立。作者发现(8.3)、(8.6)、(8.9)、(8.14)、(8.15)、(8.16a)、(8.17)和(8.18)对于简化复杂表达式特别有用。通常,一个函数可以用多种方式编写,每种方式都需要相同的最小数量的元素。在这种情况下,可以从这些电路中或从其他考虑中任意地选择电路。
Any expression formed with the operations of addition, multiplication, and negation represents explicitly a circuit containing only series and parallel connections. Such a circuit will be called a series-parallel circuit. Each letter in an expression of this sort represents a make or break relay contact, or a switch blade and contact. To find the circuit requiring the least number of contacts, it is therefore necessary to manipulate the expression into the form in which the least number of letters appear. The theorems given above are always sufficient to do this. A little practice in the manipulation of these symbols is all that is required. Fortunately most of the theorems are exactly the same as those of numerical algebra—the associative, commutative, and distributive laws of algebra hold here. The writer has found (8.3), (8.6), (8.9), (8.14), (8.15), (8.16a), (8.17), and (8.18) to be especially useful in the simplification of complex expressions. Frequently a function may be written in several ways, each requiring the same minimum number of elements. In such a case the choice of circuit may be made arbitrarily from among these, or from other considerations.
作为表达式简化的示例,请考虑图 8.6中所示的电路。该电路的阻碍函数X ab为:
As an example of the simplification of expressions consider the circuit shown in Figure 8.6. The hindrance function Xab for this circuit will be:
图 8.6: 要简化的电路
Figure 8.6: Circuit to be simplified
这些归约是通过(8.17b)首先使用W,然后使用X和Y作为( 8.17b )的“ X ”来实现的。现在相乘:
These reductions were made with (8.17b) using first W, then X and Y as the “X” of (8.17b). Now multiplying out:
与该表达式对应的电路如图8.7所示。请注意元素数量的大幅减少。……
The circuit corresponding to this expression is shown in Figure 8.7. Note the large reduction in the number of elements. …
Figure 8.7: Simplification of Figure 8.6
经麻省理工学院许可,转载自 Shannon(1938)。
Reprinted from Shannon (1938), with permission from the Massachusetts Institute of Technology.
这是一篇奇怪而精彩的论文,技术上错综复杂,预言性傲慢,是湿漉漉的神经科学和严谨的数学抽象的混合体,与以前写的任何文章都不同。它的记法很复杂,其主张也很夸张。它的目标无非就是将人脑的功能简化为数学逻辑,从而解释思想、记忆和心智。尽管它的不谦虚和天真,它是数十年来滋养计算机科学的思想源泉。
This is a strange and wonderful paper, technically convoluted and hubristically prophetic, an admixture of wet neuroscience and austere mathematical abstraction unlike anything written before. It is complex in its notation and extravagant in its pretensions. Its goal is nothing less than to reduce the functioning of the human brain to mathematical logic, and thereby to explain thought, memory, and mind. In all its immodesty and naïveté, it is the bubbling source of ideas that have nourished computer science for decades.
作者描绘了一个由两种神经元组成的神经网络。有些神经元不接收来自其他神经元的输入;它们是感觉数据的承载者,被称为“外周传入”。其他的是开关,可以处于两种状态之一:触发或不触发。全局时钟使系统同步,每个神经元在时间t + 1时的状态取决于其输入在时间t时的状态。神经元的输入来自外周传入神经或其他神经元的输出(“传出神经”);输入到达神经元的连接点是突触。McCulloch-Pitts 神经元i具有阈值θ i;如果超过θ i个输入激活,神经元i就会激活,但神经元也有一个抑制输入,如果抑制输入被激活,则神经元不会激活。在他们的神经元图中(第86页的图9.1),兴奋性输入位于三角形图的斜边,抑制性输入位于左侧的点,输出或传出来自右侧的垂直边。
The authors picture a neural network consisting of two kinds of neurons. Some receive inputs from no other neurons; they are the bearers of sensory data and are referred to as “peripheral afferents.” The others are switches, and can be in one of two states, either firing or not. A global clock synchronizes the system, and the state of each neuron at time t + 1 depends on the states of its inputs at time t. The inputs to a neuron are from peripheral afferents or from the outputs (“efferents”) of other neurons; the junction points, where the inputs arrive at a neuron, are synapses. A McCulloch–Pitts neuron i has a threshold θi; neuron i fires if more than θi of its inputs fire—except that the neuron also has an inhibitory input, and will not fire if the inhibitory input is activated. In their drawings of neurons (Figure 9.1 on page 86), the excitatory inputs are on the slanted sides of the triangular diagram, the inhibitory input is at the point on the left, and the output or efferent comes from the vertical side on the right.
图 9.1: 神经元c i总是在细胞体上标有数字i ,相应的动作用“ N ”表示, i为下标,如文中所示。
Figure 9.1: The neuron ci is always marked with the numeral i upon the body of the cell, and the corresponding action is denoted by “N” with i as subscript, as in the text.
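The firing rule described in the introduction above can be written as a one-step update. This is a minimal sketch under that description only: the function name `neuron_fires` is hypothetical, inhibition is treated as absolute, and the strict "more than θ" comparison follows the wording of the introduction (a "greater than or equal" convention is a one-line change).

```python
def neuron_fires(excitatory, inhibitory, threshold):
    """State of a neuron at time t + 1 given its inputs at time t.

    excitatory, inhibitory: sequences of 0/1 input states at time t.
    threshold: the neuron's threshold θ.
    """
    if any(inhibitory):
        return False  # an active inhibitory input vetoes firing
    return sum(excitatory) > threshold  # strict comparison, as in the introduction


# With two excitatory inputs, threshold 0 behaves like "or" and threshold 1 like "and".
print(neuron_fires([1, 0], [], 0))   # or-like unit
print(neuron_fires([1, 0], [], 1))   # and-like unit with only one active input
print(neuron_fires([1, 1], [1], 1))  # inhibited
```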
简而言之,本文将大脑及其所有功能建模为数字系统。图 9.1中的神经元是今天所谓的阈值逻辑中的门。作者的目标是确定这样的网络可以执行哪些类型的计算。他们通过将每个神经元i与一个谓词N i ( t ) 关联起来来实现这一点,如果神经元i在时间t激发,则该谓词 N i (t) 为真。有了这个背景,在开始阅读本文之前,有必要先研究一下图 9.1中的一些图表和公式。(单点和垂直双点是括号的替代方案;它们倾向于将公式分开,两个点比一个点更强烈。单点也可以表示合取。因此(e)部分的第一行对应于N 3 ( t ) ≡ [ N 1 ( t − 1) ∨ ( N 2 ( t − 3) ∧ ∼ N 2 ( t − 2))],其中∼代表“非”,∨ 代表“或”,∧ 代表“和。”)
In short, this paper models the brain and all its functions as a digital system. The neurons of Figure 9.1 are gates in what would today be called a threshold logic. The authors’ goal is to determine what kinds of computations such a network can carry out. They do this by associating with each neuron i a predicate Ni(t) that is true if neuron i is firing at time t. With this background, it is worth working through some of the diagrams and formulas of Figure 9.1 before beginning to read the paper. (The single dots and vertical double dots are an alternative to parentheses; they tend to push formulas apart, two dots more strongly than one. A single dot can also denote conjunction. So the first line of part (e) corresponds to N3(t) ≡ [N1(t − 1) ∨ (N2(t − 3) ∧ ∼ N2(t − 2))], where ∼ stands for “not,” ∨ for “or,” and ∧ for “and.”)
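The formula quoted for part (e) can be evaluated on recorded firing histories. In this sketch the histories are plain lists of booleans indexed by time step, which is an illustrative encoding rather than anything in the paper, and `n3` and `n3_train` are hypothetical names.

```python
def n3(n1, n2, t):
    """N3(t) ≡ N1(t − 1) ∨ (N2(t − 3) ∧ ∼N2(t − 2))."""
    return bool(n1[t - 1] or (n2[t - 3] and not n2[t - 2]))


def n3_train(n1, n2):
    """Firing of neuron 3 at every time step for which the formula is defined."""
    last = min(len(n1), len(n2))
    return {t: n3(n1, n2, t) for t in range(3, last + 1)}


# Neuron 2 fires once at t = 0 and then falls silent; neuron 1 fires at t = 4.
n1 = [False, False, False, False, True, False]
n2 = [True, False, False, False, False, False]
print(n3_train(n1, n2))
```

Neuron 3 fires at t = 3 because neuron 2 fired at t = 0 and was silent at t = 1, and again at t = 5 because neuron 1 fired at t = 4.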
论文的具体技术成就是证明神经网络可计算的谓词集合与某种极富表现力的逻辑可表达的谓词集合完全相同。特别关注具有反馈的网络,其中一个神经元的输出在影响一系列其他神经元后,作为原始神经元的输入循环返回。作者认为,这种活动周期可以解释记忆。因此,他们推断,鉴于对网络的完整描述,“对于预测而言,历史记录从来都不是必要的”(第 88 页)。大脑是我们现在所说的确定性有限状态机:它的未来完全由它当前的状态和未来的输入决定。
The specific technical accomplishment of the paper is to prove that the set of predicates computable by neural nets is exactly the same as the set of predicates expressible in a certain very expressive logic. Particular attention is given to networks with feedback, in which the output of one neuron, after affecting a series of other neurons, loops back as an input to the original neuron. Such cycles of activity, the authors suggest, explain memory. So given a complete account of the network, they reasoned, “for prognosis, history is never necessary” (page 88). The brain is what we would now call a deterministic finite-state machine: its future is completely determined by its present state and its inputs going forward.
这篇论文的公式充满了错误和不恰当——“ S ”命名了后继函数,但粗体“ S ”是一个不相关的变量,代表“句子”。Stephen Cole Kleene 简化了该模型,并于 1951 年将其用作有限自动机和正则表达式形式化的基础。觉得《逻辑微积分》很难读的读者应该会对克莱恩的看法感到放心:“本文部分是对麦卡洛克-皮茨结果的阐述;但我们发现他们论文中涉及任意神经网络的部分晦涩难懂…… ” (克莱恩,1951)。
The paper’s formulas are ridden with errors and infelicities—“S” names the successor function, but boldface “S” is an unrelated variable standing for “sentence.” Stephen Cole Kleene simplified the model and used it in 1951 as the basis for his formalization of finite automata and regular expressions. Readers who find “A Logical Calculus” tough going should be reassured by Kleene’s take on it: “The present article is partly an exposition of the McCulloch–Pitts results; but we found the part of their paper which treats of arbitrary nerve nets obscure ….” (Kleene, 1951).
很快人们就发现,麦卡洛克和皮茨所描述的数字装置对于大脑来说是一个糟糕的模型。“青蛙的眼睛告诉青蛙的大脑的东西”(Lettvin et al., 1959)并不是位图,这一发现似乎粉碎了皮茨对世界进行逻辑理解的希望。然而,麦卡洛克和皮茨不仅催生了有限自动机理论,而且催生了庞大的神经计算领域。他们的大胆行为得到了回报,只不过不是以他们希望的方式。
It soon became evident that the digital contraption McCulloch and Pitts described was a poor model for the brain. The discovery that “what the frog’s eye tells the frog’s brain” (Lettvin et al., 1959) was not a bitmap seemed to shatter Pitts’s hope of making logical sense of the world. And yet McCulloch and Pitts had given birth not just to finite automata theory but to the sprawling field of neural computing. Their audacity paid off—just not in the way they had hoped.
《逻辑演算》是一段非凡而悲惨的伙伴关系的产物。沃伦·麦卡洛克(Warren McCulloch,1898-1969)是一位渴望科学地理解心灵的神经科学家。作为一个成功的律师和工程师家庭的成员,麦卡洛克对主导二十世纪中叶心理学的弗洛伊德理论持怀疑态度。他接触过怀特海和罗素的数学原理的逻辑,但无法将其与他对神经解剖学和功能的理解结合起来。沃尔特·皮茨(Walter Pitts,1923-1969 年)是底特律一位虐待工人的父亲的儿子,小时候在公共图书馆找到了庇护所。他通过独自学习变得博学多才,特别是在 12 岁时阅读了《原理》 ,并与伯特兰·罗素就该书进行了通信。几年后,听说罗素要到芝加哥大学讲学,他离家出走,再也没有回来。他在大学闲逛,在那里认识了麦卡洛克。(你想到《善意狩猎》不会错。)当时 42 岁的教授麦卡洛克和 18 岁无家可归的离家出走的皮茨都读过莱布尼茨的著作,并决心发展莱布尼茨的思想演算。具有良好的数学和神经解剖学基础。这篇论文就是结果。麦卡洛克后来第一次宣称,“我们知道我们是如何知道的。” 在最后一节中,它提出,在解释了精神功能之后,精神障碍将来将被理解为特定的神经网络功能障碍。
“A Logical Calculus” is the product of an extraordinary and tragic partnership. Warren McCulloch (1898–1969) was a neuroscientist who longed to understand the mind scientifically. A member of a successful family of lawyers and engineers, McCulloch was skeptical of the Freudian theories that dominated mid-twentieth century psychology. He had encountered the logic of Whitehead and Russell’s Principia Mathematica but could not marry it to his understanding of neural anatomy and function. Walter Pitts (1923–1969), the son of an abusive working-class father in Detroit, found shelter as a boy in a public library. He became remarkably learned through solitary study, and in particular read the Principia at the age of 12 and entered into a correspondence about it with Bertrand Russell. A few years later, hearing that Russell was lecturing at the University of Chicago, he ran away from home, never to return. He hung around the University, where he met McCulloch. (You would not be wrong to think of Good Will Hunting.) McCulloch, 42 at the time and a professor, and Pitts, a homeless 18-year-old runaway, had both read Leibniz and were determined to develop a Leibnizian calculus of thought with a sound mathematical and neuroanatomical basis. This paper is the upshot. For the first time, McCulloch later declared, “we know how we know.” In the last section, it proposes that mental functioning having been explained, mental disorders would in the future be understood as specific neural net malfunctions.
唉,皮茨本人也患上了精神疾病。他和麦卡洛克最终都进入了麻省理工学院;皮茨虽然从未上过高中,却成为控制论先驱诺伯特·维纳(第 19 章的作者)的研究生。三人之间的私人关系破裂使皮茨陷入抑郁和酗酒的漩涡,最终于 46 岁时去世。比他大 25 岁的麦卡洛克在几个月后去世(Gefter,2015)。
Pitts, alas, himself fell prey to mental illness. Both he and McCulloch wound up at MIT—Pitts, though he had never attended high school, as a graduate student for the pioneering cybernetician Norbert Wiener (the author of chapter 19). A fracture in the personal relationship between the three men sent Pitts into a spiral of depression and alcoholism from which he died at age 46. McCulloch, 25 years his senior, died a few months later (Gefter, 2015).
由于神经活动的“全有或全无”特征,神经事件及其之间的关系可以用命题逻辑来处理。我们发现每个网络的行为都可以用这些术语来描述,并为包含圆圈的网络添加更复杂的逻辑手段;对于任何满足特定条件的逻辑表达式,我们都可以找到一个以其所描述的方式表现的网络。结果表明,可能的神经生理学假设中的许多特定选择是等效的,即对于在一种假设下表现的每个网络,都存在另一个在另一种假设下表现的网络,并给出相同的结果,尽管可能不同时。讨论了微积分的各种应用。
BECAUSE of the “all-or-none” character of nervous activity, neural events and the relations among them can be treated by means of propositional logic. It is found that the behavior of every net can be described in these terms, with the addition of more complicated logical means for nets containing circles; and that for any logical expression satisfying certain conditions, one can find a net behaving in the fashion it describes. It is shown that many particular choices among possible neurophysiological assumptions are equivalent, in the sense that for every net behaving under one assumption, there exists another net which behaves under the other and gives the same results, although perhaps not in the same time. Various applications of the calculus are discussed.
理论神经生理学依赖于某些基本假设。神经系统是一个由神经元组成的网络,每个神经元都有一个胞体和一个轴突。它们的附属物或突触总是位于一个神经元的轴突和另一个神经元的体细胞之间。在任何时刻,神经元都有某个阈值,兴奋必须超过该阈值才能启动脉冲。除了事实及其发生时间之外,这是由神经元决定的,而不是由兴奋决定的。脉冲从激发点传播到神经元的所有部分。沿轴突的速度直接随其直径变化,从通常较短的细轴突中的< 1ms -1到通常较长的粗轴突中的 > 150ms -1。因此,轴突传导的时间对于确定脉冲到达距离同一源的距离不同的点的时间并不重要。跨突触的兴奋主要发生在从轴突终止到体细胞的过程中。这是否取决于单个突触的互易性或仅仅取决于普遍的解剖结构,仍然是一个有争议的问题。假设后者不需要临时假设并解释已知的例外,但任何关于原因的假设都与即将到来的微积分兼容。目前尚不清楚通过单个突触的激发会在任何神经元中引起神经冲动,而任何神经元都可能被在潜伏添加期间到达足够数量的相邻突触的冲动所激发,该潜伏期持续< 0.25 ms。对于单个神经元来说,以更大的间隔观测到脉冲的时间总和是不可能的,并且根据经验取决于网络的结构特性。在神经元上的脉冲到达与其自身传播的脉冲之间存在 > 0.5 ms 的突触延迟。在神经冲动的第一部分,神经元对任何刺激都绝对不敏感。此后,其兴奋性迅速恢复,在某些情况下达到高于正常值,然后再次下降到低于正常值,然后缓慢恢复到正常值。频繁的活动会加剧这种不正常现象。神经冲动所具有的这种特异性仅取决于它们的时间和地点,而不取决于神经能量的任何其他特异性。最近,只有抑制被认真地引用来与这一论点相抵触。抑制是通过第二组神经元的并发或先行活动来终止或阻止一组神经元的活动。直到最近,这还可以通过以下假设来解释:第二组神经元先前的活动可能会提高内部神经元的阈值,以致它们不再被兴奋由第一组的神经元产生,而第一组的脉冲必须与这些内部的脉冲相加,以激发现在被抑制的神经元。如今,一些抑制已被证明消耗< 1 毫秒。这排除了内部神经元,并且需要突触,脉冲通过突触抑制通过其他突触受到脉冲刺激的神经元。迄今为止,实验尚未表明耐火度是相对的还是绝对的。我们将假设后者并证明这种差异对我们的论点并不重要。任何一种耐火度都可以通过两种方式来解释。“抑制性突触”可以是一种产生提高神经元阈值的物质的类型,或者它可以被放置成使得由其兴奋产生的局部干扰对抗由其他兴奋性突触引起的改变。由于已知位置在电刺激的情况下具有这种影响,因此应排除第一个假设,除非并且直到它得到证实,因为第二个假设不涉及新的假设。那么,基于相同的一般前提,我们对抑制有两种解释,只是假设的神经网络不同,因此抑制所需的时间也不同。此后我们将把这种神经网络称为广义上的等价网络。由于我们关心的是在等价条件下不变的网络属性,我们可以做出最方便微积分的物理假设。
Theoretical neurophysiology rests on certain cardinal assumptions. The nervous system is a net of neurons, each having a soma and an axon. Their adjunctions, or synapses, are always between the axon of one neuron and the soma of another. At any instant a neuron has some threshold, which excitation must exceed to initiate an impulse. This, except for the fact and the time of its occurrence, is determined by the neuron, not by the excitation. From the point of excitation the impulse is propagated to all parts of the neuron. The velocity along the axon varies directly with its diameter, from $< 1\,\mathrm{m\,s^{-1}}$ in thin axons, which are usually short, to $> 150\,\mathrm{m\,s^{-1}}$ in thick axons, which are usually long. The time for axonal conduction is consequently of little importance in determining the time of arrival of impulses at points unequally remote from the same source. Excitation across synapses occurs predominantly from axonal terminations to somata. It is still a moot point whether this depends upon irreciprocity of individual synapses or merely upon prevalent anatomical configurations. To suppose the latter requires no hypothesis ad hoc and explains known exceptions, but any assumption as to cause is compatible with the calculus to come. No case is known in which excitation through a single synapse has elicited a nervous impulse in any neuron, whereas any neuron may be excited by impulses arriving at a sufficient number of neighboring synapses within the period of latent addition, which lasts < 0.25 ms. Observed temporal summation of impulses at greater intervals is impossible for single neurons and empirically depends upon structural properties of the net. Between the arrival of impulses upon a neuron and its own propagated impulse there is a synaptic delay of > 0.5 ms. During the first part of the nervous impulse the neuron is absolutely refractory to any stimulation. 
Thereafter its excitability returns rapidly, in some cases reaching a value above normal from which it sinks again to a subnormal value, whence it returns slowly to normal. Frequent activity augments this subnormality. Such specificity as is possessed by nervous impulses depends solely upon their time and place and not on any other specificity of nervous energies. Of late only inhibition has been seriously adduced to contravene this thesis. Inhibition is the termination or prevention of the activity of one group of neurons by concurrent or antecedent activity of a second group. Until recently this could be explained on the supposition that previous activity of neurons of the second group might so raise the thresholds of internuncial neurons that they could no longer be excited by neurons of the first group, whereas the impulses of the first group must sum with the impulses of these internuncials to excite the now inhibited neurons. Today, some inhibitions have been shown to consume < 1 ms. This excludes internuncials and requires synapses through which impulses inhibit that neuron which is being stimulated by impulses through other synapses. As yet experiment has not shown whether the refractoriness is relative or absolute. We will assume the latter and demonstrate that the difference is immaterial to our argument. Either variety of refractoriness can be accounted for in either of two ways. The “inhibitory synapse” may be of such a kind as to produce a substance which raises the threshold of the neuron, or it may be so placed that the local disturbance produced by its excitation opposes the alteration induced by the otherwise excitatory synapses. Inasmuch as position is already known to have such effects in the cases of electrical stimulation, the first hypothesis is to be excluded unless and until it be substantiated, for the second involves no new hypothesis. 
We have, then, two explanations of inhibition based on the same general premises, differing only in the assumed nervous nets and, consequently, in the time required for inhibition. Hereafter we shall refer to such nervous nets as equivalent in the extended sense. Since we are concerned with properties of nets which are invariant under equivalence, we may make the physical assumptions which are most convenient for the calculus.
许多年前,我们中的一个人出于与这一论点无关的考虑,被引导认为任何神经元的反应实际上等同于提出其充分刺激的命题。因此,他试图用命题的符号逻辑的符号来记录复杂网络的行为。神经活动的“全有或全无”定律足以确保任何神经元的活动都可以表示为一个命题。神经活动之间存在的生理关系当然对应于命题之间的关系;表征的效用取决于这些关系与命题逻辑关系的同一性。对于任何神经元的每个反应,都有一个简单命题的相应断言。反过来,根据突触的配置和所讨论的神经元的阈值,这意味着一些其他简单命题或相似命题的合取(有或没有否定)的析取。出现了两个困难。第一个涉及促进和消退,其中先前的活动暂时改变对网络的一个或同一部分的后续刺激的反应。第二个涉及学习,其中先前某个时间同时发生的活动永久地改变了网络,因此以前不充分的刺激现在已经足够了。但对于经历这两种改变的网络,我们可以替换由连接和阈值未改变的神经元组成的等效虚拟网络。但必须明确一点:我们都不认为形式上的等价是一种事实解释。每反对!-我们认为促进和消光取决于与电学和化学变量相关的阈值的连续变化,例如后电位和离子浓度;学习是一种持久的变化,可以在睡眠、麻醉、抽搐和昏迷中幸存下来。形式对等的重要性在于:实际上促进、灭绝和学习决不会影响对神经网络活动进行形式化处理所得出的结论,相应命题的关系仍然是命题逻辑的关系。
Many years ago one of us, by considerations impertinent to this argument, was led to conceive of the response of any neuron as factually equivalent to a proposition which proposed its adequate stimulus. He therefore attempted to record the behavior of complicated nets in the notation of the symbolic logic of propositions. The “all-or-none” law of nervous activity is sufficient to insure that the activity of any neuron may be represented as a proposition. Physiological relations existing among nervous activities correspond, of course, to relations among the propositions; and the utility of the representation depends upon the identity of these relations with those of the logic of propositions. To each reaction of any neuron there is a corresponding assertion of a simple proposition. This, in turn, implies either some other simple proposition or the disjunction of the conjunction, with or without negation, of similar propositions, according to the configuration of the synapses upon and the threshold of the neuron in question. Two difficulties appeared. The first concerns facilitation and extinction, in which antecedent activity temporarily alters responsiveness to subsequent stimulation of one and the same part of the net. The second concerns learning, in which activities concurrent at some previous time have altered the net permanently, so that a stimulus which would previously have been inadequate is now adequate. But for nets undergoing both alterations, we can substitute equivalent fictitious nets composed of neurons whose connections and thresholds are unaltered. But one point must be made clear: neither of us conceives the formal equivalence to be a factual explanation. Per contra!—we regard facilitation and extinction as dependent upon continuous changes in threshold related to electrical and chemical variables, such as after-potentials and ionic concentrations; and learning as an enduring change which can survive sleep, anaesthesia, convulsions and coma. 
The importance of the formal equivalence lies in this: that the alterations actually underlying facilitation, extinction and learning in no way affect the conclusions which follow from the formal treatment of the activity of nervous nets, and the relations of the corresponding propositions remain those of the logic of propositions.
神经系统包含许多循环路径,其活动会重新产生任何参与神经元的兴奋,以至于对过去的时间的引用变得不确定,尽管它仍然意味着传入活动随着时间的推移已经实现了某一类配置中的一种。通过递归函数对这些含义的精确说明,以及对神经网络活动中可以体现的含义的确定,完成了该理论。
The nervous system contains many circular paths, whose activity so regenerates the excitation of any participant neuron that reference to time past becomes indefinite, although it still implies that afferent activity has realized one of a certain class of configurations over time. Precise specification of these implications by means of recursive functions, and determination of those that can be embodied in the activity of nervous nets, completes the theory.
我们将为我们的微积分做出以下物理假设。
We shall make the following physical assumptions for our calculus.
1. 神经元的活动是一个“全有或全无”的过程。
1. The activity of the neuron is an “all-or-none” process.
2.为了在任何时刻激发神经元,必须在潜在添加期间激发一定固定数量的突触,并且该数量与神经元先前的活动和位置无关。
2. A certain fixed number of synapses must be excited within the period of latent addition in order to excite a neuron at any time, and this number is independent of previous activity and position on the neuron.
3. 神经系统内唯一显着的延迟是突触延迟。
3. The only significant delay within the nervous system is synaptic delay.
4.任何抑制性突触的活动都绝对阻止神经元当时的兴奋。
4. The activity of any inhibitory synapse absolutely prevents excitation of the neuron at that time.
5.网络的结构不随时间改变。
5. The structure of the net does not change with time.
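The five assumptions above can be read as an update rule for a single idealized neuron. What follows is a minimal sketch rather than the authors' notation: the function name `fires` and its parameters are hypothetical, and it assumes that "a certain fixed number of synapses" means the count of excited excitatory synapses must reach the threshold.

```python
def fires(threshold, excitatory, inhibitory):
    """Return True if the neuron fires one synaptic delay after these inputs.

    threshold  -- fixed number of excited synapses required (assumption 2)
    excitatory -- list of (active, synapse_count) pairs, one per input neuron
    inhibitory -- list of booleans, one per inhibitory input neuron
    """
    # Assumption 4: any active inhibitory synapse absolutely prevents firing.
    if any(inhibitory):
        return False
    # Assumption 2: count the synapses excited within the period of latent addition.
    excited = sum(count for active, count in excitatory if active)
    # Assumption 1: the outcome is all-or-none, so it is a plain boolean.
    return excited >= threshold
```

Assumption 3 appears only as the one-step delay between inputs and output, and assumption 5 as the fact that the threshold and synapse counts are fixed arguments rather than state that changes over time.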
为了呈现这一理论,最合适的象征主义是卡尔纳普 (1937) 的语言 II 的象征主义,并补充了来自怀特海和罗素 (1910) 的各种符号,包括点的原理约定。然而,印刷上的必要性将迫使我们使用直立的“ E ”而不是倒置的“E”来表示存在运算符,并使用箭头(→)来代替马蹄形来表示暗示。我们还将使用卡尔纳普语法符号,但以粗体而不是德语字体打印它们;我们将引入一个函子S,其对于属性P 的值是当P持有其前任时持有数字的属性;它的定义为“ S ( P )( t ) ▪ ≡ ▪ P ( x ) ▪ t = x ′” [编辑:根据原始内容进行推测性更正];其参数周围的括号通常会被省略,在这种情况下,这被理解为右侧最接近的谓词表达式 [ Pr ]。此外,我们将S ( S ( Pr ))写为S 2 Pr ,等等。
To present the theory, the most appropriate symbolism is that of Language II of Carnap (1937), augmented with various notations drawn from Whitehead and Russell (1910), including the Principia conventions for dots. Typographical necessity, however, will compel us to use the upright “E” for the existential operator instead of the inverted, and an arrow (→) for implication instead of the horseshoe. We shall also use the Carnap syntactical notations, but print them in boldface rather than German type; and we shall introduce a functor $S$, whose value for a property $P$ is the property which holds of a number when $P$ holds of its predecessor; it is defined by “$S(P)(t) \,.\equiv.\, P(x) \,.\, t = x'$” [EDITOR: speculatively corrected from the original]; the brackets around its argument will often be omitted, in which case this is understood to be the nearest predicate-expression [Pr] on the right. Moreover, we shall write $S^2 Pr$ for $S(S(Pr))$, etc.
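The functor S can be pictured as a one-step delay on properties of times. The sketch below is illustrative only: it assumes times are the natural numbers starting at 0 (so that time 0 has no predecessor), and the names `S`, `S_power`, and `fires_at_three` are hypothetical.

```python
def S(P):
    """Shift a property of times one step later: S(P)(t) holds iff P(t - 1) holds."""
    def shifted(t):
        # Assumed convention: time 0 has no predecessor, so S(P) is false there.
        return t >= 1 and P(t - 1)
    return shifted


def S_power(n, P):
    """Apply S n times, matching the abbreviation of S(S(Pr)) as the second power of S."""
    for _ in range(n):
        P = S(P)
    return P


fires_at_three = lambda t: t == 3          # hypothetical example property
print([t for t in range(8) if S(fires_at_three)(t)])           # [4]
print([t for t in range(8) if S_power(2, fires_at_three)(t)])  # [5]
```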
给定网络𝒩的神经元可以被指定为“ c 1 ”、“ c 2 ”、...、“ cn ”。完成此操作后,我们将用“ N ”表示一个数字的属性,即神经元c i在距时间原点的突触延迟数的时间激发,并用数字i作为下标,因此N i ( t ) 断言ci在时间t触发。N i称为c i的动作。有时我们会认为“ N ”的下标数字就好像它属于对象语言,并且位于函子参数的位置,因此它可以被数字变量[ z ]代替并被量化;这使我们能够通过使用运算符来缩写长但有限的析取和合取。对于Pr的序列,我们将非常普遍地使用这种说法;它可以通过明显的析取定义来正式保证。谓词“ N 1 ”、“ N 2 ” ……组成句法类别“ N ”。
The neurons of a given net 𝒩 may be assigned designations “$c_1$,” “$c_2$,” …, “$c_n$.” This done, we shall denote the property of a number, that a neuron $c_i$ fires at a time which is that number of synaptic delays from the origin of time, by “$N$” with the numeral $i$ as subscript, so that $N_i(t)$ asserts that $c_i$ fires at the time $t$. $N_i$ is called the action of $c_i$. We shall sometimes regard the subscripted numeral of “$N$” as if it belonged to the object-language, and were in a place for a functoral argument, so that it might be replaced by a number-variable [$z$] and quantified; this enables us to abbreviate long but finite disjunctions and conjunctions by the use of an operator. We shall employ this locution quite generally for sequences of Pr; it may be secured formally by an obvious disjunctive definition. The predicates “$N_1$,” “$N_2$,” …, comprise the syntactical class “N.”
让我们将𝒩的外周传入神经定义为𝒩的神经元,其上没有轴突突触。令N 1 , … , N p表示这些神经元的动作,N p +1 , N p +2 , … , N n表示这些神经元的动作其余的部分。那么𝒩的解将是一类S i形式的句子:N p +1 ( z 1 ) ▪ ≡ ▪ Pr i ( N 1 , N 2 , … , N p , z 1 ) ,其中Pr i包含没有自由变量保存z 1,也没有描述性符号保存参数 [ Arg ] 中的N,可能还有一些常量句子 [ sa ];并且每个S i对于𝒩都是正确的。相反,给定 a ,不包含自由变量(除了Arg中的自由变量),我们可以说,如果存在一个网𝒩和其中的一系列N i ,使得N 1 ( z 1 ) ▪ ≡ ,则它在狭义上是可实现的▪ Pr 1 ( N 1 , N 2 , … , z 1 , sa 1 ) 成立,其中sa 1 的形式为N (0)。如果对于某些n,S n ( Pr 1 )( p 1 , … , p p , z 1 , s ) 在上述意义上是可实现的,我们将称其为扩展意义上的可实现,或者简称为可实现。c pi在这里是实现神经元。我们将谈到神经兴奋的两个定律,这两个定律使得每一个在一个假设上在任一意义上可实现的S也可以在另一个假设上通过不同的网络实现,在这个意义上,它们是等价的假设。
Let us define the peripheral afferents of 𝒩 as the neurons of 𝒩 with no axons synapsing upon them. Let $N_1, \dots, N_p$ denote the actions of such neurons and $N_{p+1}, N_{p+2}, \dots, N_n$ those of the rest. Then a solution of 𝒩 will be a class of sentences of the form $S_i$: $N_{p+1}(z_1) \,.\equiv.\, Pr_i(N_1, N_2, \dots, N_p, z_1)$, where $Pr_i$ contains no free variable save $z_1$ and no descriptive symbols save the $N$ in the argument [Arg], and possibly some constant sentences [sa]; and such that each $S_i$ is true of 𝒩. Conversely, given a $Pr_1(p_1, \dots, p_p, z_1, s)$, containing no free variable save those in its Arg, we shall say that it is realizable in the narrow sense if there exists a net 𝒩 and a series of $N_i$ in it such that $N_1(z_1) \,.\equiv.\, Pr_1(N_1, N_2, \dots, z_1, sa_1)$ is true of it, where $sa_1$ has the form $N(0)$. We shall call it realizable in the extended sense, or simply realizable, if for some $n$, $S^n(Pr_1)(p_1, \dots, p_p, z_1, s)$ is realizable in the above sense. $c_{pi}$ is here the realizing neuron. We shall say of two laws of nervous excitation which are such that every S which is realizable in either sense upon one supposition is also realizable, perhaps by a different net, upon the other, that they are equivalent assumptions, in that sense.
以下关于可实现性的定理都是指扩展意义上的。在某些情况下,可以获得关于窄可实现性的更清晰的定理;但是,除了陈述更加复杂之外,这几乎没有实际价值,因为我们目前的神经生理学知识仅将激发定律确定为扩展等效性,并且根据我们做出的可能假设,更精确的定理会有所不同。然而,我们不太精确的定理在等价条件下是不变的,并且仍然足以满足脉冲通过整个网络的确切时间并不重要的所有目的。
The following theorems about realizability all refer to the extended sense. In some cases, sharper theorems about narrow realizability can be obtained; but in addition to greater complication in statement this were of little practical value, since our present neurophysiological knowledge determines the law of excitation only to extended equivalence, and the more precise theorems differ according to which possible assumption we make. Our less precise theorems, however, are invariant under equivalence, and are still sufficient for all purposes in which the exact time for impulses to pass through the whole net is not crucial.
我们的中心问题现在可以准确地表述:首先,找到一种有效的方法来获得构成任何给定网络的解的一组可计算的S;其次,以有效的方式描述可实现的S的类别。实质上来说,问题是计算任何网络的行为,并找到一个在存在这样的网络时将以指定方式运行的网络。
Our central problems may now be stated exactly: first, to find an effective method of obtaining a set of computable S constituting a solution of any given net; and second, to characterize the class of realizable S in an effective fashion. Materially stated, the problems are to calculate the behavior of any net, and to find a net which will behave in a specified way, when such a net exists.
如果一个网络包含一个圆,则该网络将被称为循环网络,即如果其上存在一条神经元链c i、c i +1、...,该链的每个成员与下一个成员突触,具有相同的开始和结束。如果一组神经元c 1 , c 2 , … , cp使得它从𝒩中移除后没有圆圈,并且没有更小的神经元类具有此属性,则该集合称为循环集,其基数为𝒩的顺序。_ 正如我们将看到的,从重要意义上讲,网络的阶数是其行为复杂性的指标。特别是,零阶网络具有特别简单的属性;我们将首先讨论它们。
A net will be called cyclic if it contains a circle, i.e. if there exists a chain $c_i, c_{i+1}, \dots$ of neurons on it, each member of the chain synapsing upon the next, with the same beginning and end. If a set of its neurons $c_1, c_2, \dots, c_p$ is such that its removal from 𝒩 leaves it without circles, and no smaller class of neurons has this property, the set is called a cyclic set, and its cardinality is the order of 𝒩. In an important sense, as we shall see, the order of a net is an index of the complexity of its behaviour. In particular, nets of zero order have especially simple properties; we shall discuss them first.
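The definition of order can be made concrete with a small brute-force search. This is a sketch under stated assumptions, not a method given in the paper: a net is represented as a list of neuron labels plus a list of `(source, target)` synapse pairs, and the names `has_circle` and `order` are hypothetical. The search is exponential in the number of neurons and is meant only for tiny examples.

```python
from itertools import combinations


def has_circle(neurons, synapses):
    """Detect a circle by repeatedly removing neurons that have no surviving input."""
    remaining = set(neurons)
    changed = True
    while changed:
        changed = False
        for n in list(remaining):
            # A neuron with no surviving presynaptic neuron cannot lie on a circle.
            if not any(src in remaining and dst == n for src, dst in synapses):
                remaining.remove(n)
                changed = True
    # Whatever survives has an input from another survivor, so a circle exists.
    return bool(remaining)


def order(neurons, synapses):
    """Size of a smallest set of neurons whose removal leaves the net without circles."""
    neurons = list(neurons)
    for k in range(len(neurons) + 1):
        for removed in combinations(neurons, k):
            kept = [n for n in neurons if n not in removed]
            kept_synapses = [(a, b) for a, b in synapses if a in kept and b in kept]
            if not has_circle(kept, kept_synapses):
                return k


# A chain has order 0; a two-neuron loop plus a separate self-loop has order 2.
print(order([1, 2, 3], [(1, 2), (2, 3)]))                              # 0
print(order([1, 2, 3, 4], [(1, 2), (2, 3), (3, 2), (3, 4), (4, 4)]))   # 2
```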
让我们通过以下递归定义一个时间命题表达式(TPE),指定一个时间命题函数(TPF )。
Let us define a temporal propositional expression (a TPE), designating a temporal propositional function (TPF), by the following recursion.
1. A 1 p 1 [ z 1 ] 是一个TPE,其中p 1是谓词变量。
1. A ${}^{1}p_1[z_1]$ is a TPE, where $p_1$ is a predicate-variable.
2. 如果S 1和S 2是包含相同自由个体变量的TPE,则S S 1、S 1 ∨ S 2、S 1 ▪ S 2和S 1 ▪ ∼ S 2也是如此。
2. If $S_1$ and $S_2$ are TPE containing the same free individual variable, so are $SS_1$, $S_1 \lor S_2$, $S_1 \,.\, S_2$, and $S_1 \,.\, \sim S_2$.
3.其他都不是TPE。
3. Nothing else is a TPE.
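The three clauses above amount to an inductive data type with an evaluator. The following sketch is one possible encoding, offered as an illustration: the class names (`Var`, `Shift`, `Or`, `And`, `AndNot`) and the function `holds` are hypothetical, and times are assumed to be natural numbers starting at 0.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:            # clause 1: a predicate-variable applied to the time variable
    name: str

@dataclass(frozen=True)
class Shift:          # clause 2: S applied to a TPE
    body: "TPE"

@dataclass(frozen=True)
class Or:             # clause 2: disjunction
    left: "TPE"
    right: "TPE"

@dataclass(frozen=True)
class And:            # clause 2: conjunction
    left: "TPE"
    right: "TPE"

@dataclass(frozen=True)
class AndNot:         # clause 2: conjoined negation (left and not right)
    left: "TPE"
    right: "TPE"

TPE = Union[Var, Shift, Or, And, AndNot]


def holds(e, env, t):
    """Evaluate the TPE e at time t; env maps variable names to functions of time."""
    if isinstance(e, Var):
        return env[e.name](t)
    if isinstance(e, Shift):
        return t >= 1 and holds(e.body, env, t - 1)
    if isinstance(e, Or):
        return holds(e.left, env, t) or holds(e.right, env, t)
    if isinstance(e, And):
        return holds(e.left, env, t) and holds(e.right, env, t)
    if isinstance(e, AndNot):
        return holds(e.left, env, t) and not holds(e.right, env, t)
    raise TypeError("not a TPE")   # clause 3: nothing else is a TPE
```

Note that the encoding has no free-standing negation: as in the definition, negation occurs only conjoined with a positive term.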
定理1.每个 0 阶网络都可以用时间命题表达式来求解。
THEOREM 1. Every net of order 0 can be solved in terms of temporal propositional expressions.
令c i为𝒩的任意神经元,阈值θ i > 0,并令c i 1 , c i 2 , … , c ip上分别有n i 1 , n i 2 , … , n ip兴奋性突触。令c j 1 , c j 2 , … , c jq上有抑制性突触。令κ i为 { ni 1 , ni 2 , … , n ip }子类的集合,使得它们的成员之和超过θ i。然后,根据上述假设,我们可以写出:
Let $c_i$ be any neuron of 𝒩 with a threshold $\theta_i > 0$, and let $c_{i1}, c_{i2}, \dots, c_{ip}$ have respectively $n_{i1}, n_{i2}, \dots, n_{ip}$ excitatory synapses upon it. Let $c_{j1}, c_{j2}, \dots, c_{jq}$ have inhibitory synapses upon it. Let $\kappa_i$ be the set of the subclasses of $\{n_{i1}, n_{i2}, \dots, n_{ip}\}$ such that the sum of their members exceeds $\theta_i$. We shall then be able to write, in accordance with the assumptions mentioned above:
其中“Σ”和“'∏”是析取和连取的语法符号,在每种情况下都是有限的。由于可以为每个不是外周传入的c i编写这种形式的表达式,因此我们可以通过将( 9.1)中的相应表达式替换为每个N jm或N is ,其神经元不是外周传入,并重复对结果进行处理,最终仅根据外围传入N得出N i的表达式,因为𝒩没有圆圈。此外,这个表达式将是一个TPE,因为显然 ( 9.1 ) 是;从定义可以直接得出,用TPE代替TPE中的成分p ( z )的结果也是 1。
where the “∑” and “∏” are syntactical symbols for disjunctions and conjunctions which are finite in each case. Since an expression of this form can be written for each $c_i$ which is not a peripheral afferent, we can, by substituting the corresponding expression in (9.1) for each $N_{jm}$ or $N_{is}$ whose neuron is not a peripheral afferent, and repeating the process on the result, ultimately come to an expression for $N_i$ in terms solely of peripherally afferent $N$, since 𝒩 is without circles. Moreover, this expression will be a TPE, since obviously (9.1) is; and it follows immediately from the definition that the result of substituting a TPE for a constituent $p(z)$ in a TPE is also one.
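The displayed formula (9.1) that this proof refers to does not appear in the text as given here. A reconstruction from the surrounding definitions, offered as a sketch and not as a verified transcription, would read as follows; the index conventions ($m$ running over the $q$ inhibitory inputs, $\alpha$ over the members of $\kappa_i$, and $s$ over the inputs belonging to $\alpha$) are assumptions.

```latex
\[
N_i(z_1) \;.\equiv.\; S\Bigl\{ \prod_{m=1}^{q} \sim N_{jm}(z_1) \;.\; \sum_{\alpha \in \kappa_i} \; \prod_{s \in \alpha} N_{is}(z_1) \Bigr\}
\tag{9.1}
\]
```

Read informally: $c_i$ fires at a given time if and only if, one synaptic delay earlier, none of its inhibitory inputs fired and every input in at least one sufficient subclass of its excitatory inputs fired.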
定理2.每个TPE都可以通过零阶网络实现。
THEOREM 2. Every TPE is realizable by a net of order zero.
函子S显然可以与析取、合取和否定进行交换。显然,用任何狭义可实现的S i (ins) 替换可实现表达式S 1中的p ( z ),其结果本身是可实现的ins;通过用S i网络中的实现神经元替换S 1网络中的外围传入神经元来构建实现网络。一个神经元网络实现p 1 ( z 1 ) ins,图 9.1a显示了一个实现S p 1 ( z 1 ) 的网络,因此实现S S 2 , ins,如果S 2可以实现 ins 现在如果S 2和S 3可实现,则对于合适的m和n , S m S 2和S n S 3可实现。因此S m + n S 2和S m + n S 3也是如此。现在图 9.1b-d的网络分别实现了S ( p 1 ( z 1 ) ∨ p 2 ( z 1 ))、S ( p 1 ( z 1 ) ▪ p 2 ( z 1 )) 和S ( p 1 ( z 1 ) ▪ ∼ p 2 ( z 1 )) ins 因此S m + n +1 ( S 1 ∨ S 2 )、S m + n +1 ( S 1 ▪ S 2 ) 和S m + n +1 ( S 1 ▪ ∼ S 2 ) 可在 ins 内实现 因此S 1 ∨ S 2,如果S 1 和 S 2成立,则S 1 ▪ S 2、S 1 ▪ ∼ S 2是可实现的。通过完全归纳,所有的TPE都是可以实现的。通过这种方式,所有网络都可以被视为是由图 9.1a-d的基本元素构建而成的,正如时间命题表达式是由进动、析取、合取和联合否定运算生成的一样。特别是,对应于状态的任何描述,或者网络中所有神经元的动作的真值和假值的分布,除了使它们全部为假之外,单个神经元是可构造的,其激发是该描述的有效性。此外,总是存在无限数量的拓扑不同的网络来实现任何TPE。……
The functor $S$ obviously commutes with disjunction, conjunction, and negation. It is obvious that the result of substituting any $S_i$, realizable in the narrow sense (i.n.s.), for the $p(z)$ in a realizable expression $S_1$ is itself realizable i.n.s.; one constructs the realizing net by replacing the peripheral afferents in the net for $S_1$ by the realizing neurons in the nets for the $S_i$. The one neuron net realizes $p_1(z_1)$ i.n.s., and Figure 9.1a shows a net that realizes $Sp_1(z_1)$ and hence $SS_2$, i.n.s., if $S_2$ can be realized i.n.s. Now if $S_2$ and $S_3$ are realizable then $S^m S_2$ and $S^n S_3$ are realizable i.n.s., for suitable $m$ and $n$. Hence so are $S^{m+n} S_2$ and $S^{m+n} S_3$. Now the nets of Figures 9.1b–d respectively realize $S(p_1(z_1) \lor p_2(z_1))$, $S(p_1(z_1) \,.\, p_2(z_1))$, and $S(p_1(z_1) \,.\, \sim p_2(z_1))$ i.n.s. Hence $S^{m+n+1}(S_1 \lor S_2)$, $S^{m+n+1}(S_1 \,.\, S_2)$, and $S^{m+n+1}(S_1 \,.\, \sim S_2)$ are realizable i.n.s. Therefore $S_1 \lor S_2$, $S_1 \,.\, S_2$, $S_1 \,.\, \sim S_2$ are realizable if $S_1$ and $S_2$ are. By complete induction, all TPE are realizable. In this way all nets may be regarded as built out of the fundamental elements of Figures 9.1a–d, precisely as the temporal propositional expressions are generated out of the operations of precession, disjunction, conjunction, and conjoined negation. In particular, corresponding to any description of state, or distribution of the values true and false for the actions of all the neurons of a net save that which makes them all false, a single neuron is constructible whose firing is a necessary and sufficient condition for the validity of that description. Moreover, there is always an indefinite number of topologically different nets realizing any TPE. …
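The four elementary nets named in the proof (precession, disjunction, conjunction, and conjoined negation) can each be written as a single threshold neuron. The sketch below is an illustration, not a transcription of Figure 9.1: the common threshold of 2 and the synapse counts are assumptions chosen so that each neuron computes the stated function, and all function names are hypothetical.

```python
def step(threshold, excitatory, inhibitory=()):
    """Output at time t from inputs at t - 1: absolute inhibition, then threshold test."""
    if any(inhibitory):
        return False
    return sum(count for active, count in excitatory if active) >= threshold


THETA = 2  # assumed common threshold; the synapse counts below are chosen to suit it

def precession(p1):                 # one input with two synapses: fires iff p1 fired
    return step(THETA, [(p1, 2)])

def disjunction(p1, p2):            # two synapses from each input: either suffices
    return step(THETA, [(p1, 2), (p2, 2)])

def conjunction(p1, p2):            # one synapse from each input: both are needed
    return step(THETA, [(p1, 1), (p2, 1)])

def conjoined_negation(p1, p2):     # p1 excites, p2 inhibits absolutely
    return step(THETA, [(p1, 2)], [p2])


# Check every neuron against its truth table; each output is one delay later.
for p1 in (False, True):
    for p2 in (False, True):
        assert precession(p1) == p1
        assert disjunction(p1, p2) == (p1 or p2)
        assert conjunction(p1, p2) == (p1 and p2)
        assert conjoined_negation(p1, p2) == (p1 and not p2)
```

Feeding the output of one such neuron into another adds one more synaptic delay per layer, which is why the proof works with powers of S rather than with the bare expressions.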
学习现象具有在神经活动的大多数生理变化中持续存在的特征,似乎需要网络结构发生永久性改变的可能性。最简单的这种改变是新突触的形成或等效的局部阈值降低。我们假设某些轴突末端一开始不能刺激随后的神经元;但是,如果神经元在任何时候放电,并且轴突末端同时被激发,它们就会变成普通类型的突触,从此能够激发神经元。抑制性突触的丧失给出了完全等同的结果。然后我们将有
The phenomena of learning, which are of a character persisting over most physiological changes in nervous activity, seem to require the possibility of permanent alterations in the structure of nets. The simplest such alteration is the formation of new synapses or equivalent local depressions of threshold. We suppose that some axonal terminations cannot at first excite the succeeding neuron; but if at any time the neuron fires, and the axonal terminations are simultaneously excited, they become synapses of the ordinary kind, henceforth capable of exciting the neuron. The loss of an inhibitory synapse gives an entirely equivalent result. We shall then have
定理7.可改变的突触可以用圆圈代替。
THEOREM 7. Alterable synapses can be replaced by circles.
这是通过图 9.1i的方法完成的。还需要指出的是,变得并保持自发活动的神经元同样可以用圆圈代替,该圆圈当活动开始时,它被外周传入神经启动,当活动结束时,被外周传入神经抑制。[编辑:在图 9.1中,(f) 顶部的表达式已被更正,而 (i) 顶部的表达式在原始版本中缺失。(g) 部分很神秘——下面的图是 (d) 的一个版本,但 (g) 的表达式与 (g) 的图的任何部分都不匹配。]
This is accomplished by the method of Figure 9.1i. It is also to be remarked that a neuron which becomes and remains spontaneously active can likewise be replaced by a circle, which is set into activity by a peripheral afferent when the activity commences, and inhibited by one when it ceases. [EDITOR: In Figure 9.1, the expression for the top part of (f) has been corrected and the expression for the top part of (i) is missing in the original. Part (g) is mysterious: the bottom diagram is a version of (d), but the expression for (g) matches neither part of the diagram for (g).]
对于不满足我们之前的无圆假设的网络的处理比这种情况困难得多。这很大程度上是由于以下可能性的结果:活动可能在一个回路中建立并在无限期的时间内继续在其周围回响,因此可实现的 Pr 可能涉及对无限遥远程度的过去事件的参考。… [编辑:表达式( Ex ) t − 1 ▪ N 1 ( x ) ▪ N 2 ( x ) (图 9.1的 (i) 部分)表示时间上有一个点x,不晚于时间t − 1,当N 1 ( x ) 和N 2 ( x ) 都为真时。未标记的神经元(自我反馈)将无限期地继续放电。]
The treatment of nets which do not satisfy our previous assumption of freedom from circles is very much more difficult than that case. This is largely a consequence of the possibility that activity may be set up in a circuit and continue reverberating around it for an indefinite period of time, so that the realizable Pr may involve reference to past events of an indefinite degree of remoteness. … [EDITOR: The expression $(Ex)\,t-1 \,.\, N_1(x) \,.\, N_2(x)$ (in part (i) of Figure 9.1) means that there was a point $x$ in time, no later than time $t - 1$, when $N_1(x)$ and $N_2(x)$ were both true. The unlabeled neuron, which feeds back on itself, will keep firing indefinitely.]
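The reverberating circuit described in the editor's note can be simulated with a single self-exciting neuron. This is a hypothetical wiring chosen for illustration, not a transcription of Figure 9.1i: the neuron is assumed to have threshold 2, one synapse from each of c1 and c2, and two synapses from its own axon. The name `run_latch` is likewise an assumption.

```python
def run_latch(n1, n2):
    """Simulate a hypothetical one-neuron circle fed by c1 and c2.

    n1, n2 -- lists of booleans giving N1(t) and N2(t) for t = 0, 1, 2, ...
    Returns a list m where m[t] is the latch neuron's action at time t.
    """
    threshold = 2
    m = [False]                      # nothing can have fired before time 0
    for t in range(1, len(n1) + 1):
        # One synapse each from c1 and c2, two from the neuron's own axon.
        excited = int(n1[t - 1]) + int(n2[t - 1]) + 2 * int(m[t - 1])
        m.append(excited >= threshold)
    return m


n1 = [True, False, True, False, False]
n2 = [False, False, True, False, False]
m = run_latch(n1, n2)
# m[t] is true exactly when N1 and N2 fired together at some x no later than t - 1.
expected = [any(a and b for a, b in zip(n1[:t], n2[:t])) for t in range(len(n1) + 1)]
assert m == expected
print(m)   # [False, False, False, True, True, True]
```

Once the two inputs coincide, the self-loop alone reaches the threshold, so the neuron's firing records the past event without indicating when it happened.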
最后还要指出一件事。很容易证明:首先,每个网络如果配有磁带、连接传入神经的扫描仪以及执行必要的运动操作的合适传出神经,则只能计算图灵机那样的数字;其次,后面的每个数字都可以通过这样的网络来计算;并且可以通过这样的网络来计算带有圆圈的网络;带有圆圈的网络可以在没有扫描仪和磁带的情况下计算机器可以计算的一些数字,但不能计算其他数字,也不能计算全部数字。这很有趣,因为它为可计算性的图灵定义及其等价物、丘奇的λ可定义性和克莱恩的原始递归性提供了心理学上的证明:如果有机体可以计算任何数字,那么它就可以通过这些定义进行计算,反之亦然。
One more thing is to be remarked in conclusion. It is easily shown: first, that every net, if furnished with a tape, scanners connected to afferents, and suitable efferents to perform the necessary motor-operations, can compute only such numbers as can a Turing machine; second, that each of the latter numbers can be computed by such a net; and that nets with circles can be computed by such a net; and that nets with circles can compute, without scanners and a tape, some of the numbers the machine can, but no others, and not all of them. This is of interest as affording a psychological justification of the Turing definition of computability and its equivalents, Church’s λ-definability and Kleene’s primitive recursiveness: if any number can be computed by an organism, it is computable by these definitions, and conversely.
因果关系需要对状态的描述以及与它们相关的必然联系法则,它在多种科学中以多种形式出现,但除了统计学之外,它从来没有像这一理论中那样具有互反性。对任何一次传入刺激和所有组成神经元活动的规范(每一个都是“全有或全无”事件)决定了状态。神经网络的规范提供了必要连接法则,人们可以根据任何状态的描述来计算后续状态,但是包含析取关系会阻止对前一个状态的完全确定。此外,构成圈的再生活动使得过去的时间变得不确定。因此,我们对世界的认识,包括我们自己,在空间上是不完整的,在时间上是不确定的。这种无知隐含在我们所有的大脑中,是使我们的知识变得有用的抽象的对应物。大脑在确定我们的理论与我们的观察以及这些理论与事实的认知关系方面的作用是非常清楚的,因为很明显,每一个想法和每一种感觉都是通过该网络内的活动来实现的,并且没有这样的活动是可以实现的。实际的传入完全确定。
Causality, which requires description of states and a law of necessary connection relating them, has appeared in several forms in several sciences, but never, except in statistics, has it been as irreciprocal as in this theory. Specification for any one time of afferent stimulation and of the activity of all constituent neurons, each an “all-or-none” affair, determines the state. Specification of the nervous net provides the law of necessary connection whereby one can compute from the description of any state that of the succeeding state, but the inclusion of disjunctive relations prevents complete determination of the one before. Moreover, the regenerative activity of constituent circles renders reference indefinite as to time past. Thus our knowledge of the world, including ourselves, is incomplete as to space and indefinite as to time. This ignorance, implicit in all our brains, is the counterpart of the abstraction which renders our knowledge useful. The role of brains in determining the epistemic relations of our theories to our observations and of these to the facts is all too clear, for it is apparent that every idea and every sensation is realized by activity within that net, and by no such activity are the actual afferents fully determined.
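The irreciprocity claimed here can be seen in the smallest disjunctive net. The sketch below assumes a single neuron that fires when either of two afferents fired one delay earlier; the names `next_state` and `predecessors` are hypothetical.

```python
from itertools import product


def next_state(n1, n2):
    """Successor action of a two-input disjunction neuron: fires iff n1 or n2 fired."""
    return n1 or n2


# Group the possible afferent states by the successor state they produce.
predecessors = {}
for n1, n2 in product((False, True), repeat=2):
    predecessors.setdefault(next_state(n1, n2), []).append((n1, n2))

# Forward computation is a function; the backward relation is one-to-many.
print(len(predecessors[True]))    # 3
print(predecessors[False])        # [(False, False)]
```

Given that the neuron fired, three distinct earlier states remain possible, so the state before cannot be recovered from the state after, while the reverse computation is always determinate.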
如果网络发生改变,我们可以持有的理论和我们可以做出的任何观察都无法保留其对事实的旧有缺陷的参考。耳鸣、感觉异常、幻觉、妄想、混乱和迷失方向会介入。因此,经验证实,如果我们的网络是不确定的,那么我们的事实也是不确定的,并且我们不能将“真实”归因于一种品质或“形式”。有了网络的决定,不可知的知识对象,即“事物本身”,就不再是不可知的。
There is no theory we may hold and no observation we can make that will retain so much as its old defective reference to the facts if the net be altered. Tinnitus, paraesthesias, hallucinations, delusions, confusions and disorientation intervene. Thus empiry confirms that if our nets are undefined, our facts are undefined, and to the “real” we can attribute not so much as one quality or “form.” With determination of the net, the unknowable object of knowledge, the “thing in itself,” ceases to be unknowable.
对于心理学来说,无论如何定义,网络的规范都将贡献该领域所能实现的一切——即使分析被推向最终的心理单位或“psychons”,因为一个psychons可能不亚于单个神经元的活动。由于这种活动本质上是命题性的,所以所有的心理事件都具有有意的或“符号学”的特征。这些活动的“全有或全无”法则,以及它们的关系与命题逻辑关系的一致性,确保了心理关系是命题二值逻辑的关系。因此,在心理学、内省心理学、行为主义心理学或生理学中,基本关系是二值逻辑的关系。
To psychology, however defined, specification of the net would contribute all that could be achieved in that field—even if the analysis were pushed to ultimate psychic units or “psychons,” for a psychon can be no less than the activity of a single neuron. Since that activity is inherently propositional, all psychic events have an intentional, or “semiotic,” character. The “all-or-none” law of these activities, and the conformity of their relations to those of the logic of propositions, insure that the relations of psychons are those of the two-valued logic of propositions. Thus in psychology, introspective, behavioristic or physiological, the fundamental relations are those of two-valued logic.
因此,出现了整体问题的结构性解决方案,涉及感官意识的差异化连续体以及感知和执行的规范性、完善性和解决性属性。从因果关系的不互易性可以看出,即使网络是已知的,尽管我们可以从当前的活动预测未来,但我们既不能从中枢推断出传入,也不能从传出中推断出中枢,也不能从当前的活动中推断出过去——这些结论被矛盾的事实所强化。目击者的证词,难以区分诊断器质性疾病、歇斯底里症和装病者,以及将自己的记忆或回忆与同时代的记录进行比较。此外,系统对再生网络的传入物与该网络内的某些活动之间的差异作出反应,以减少差异,从而表现出有目的的行为。众所周知,生物体拥有许多这样的系统,促进体内平衡、食欲和注意力。因此,我们通常所说的心理活动的形式和最终方面都可以从当前的神经生理学中严格推论出来。精神病学家可能会从关于因果关系的明显结论中得到安慰——对于预后来说,病史从来都不是必要的。他无法从同样有效的结论中得出什么结论,即他的观察结果只能用神经活动来解释,而直到最近,神经活动仍然超出了他的知识范围。这种无知的症结在于,从任何公开行为样本对神经网络的推论都不是唯一的,而在可想象的网络中,实际上只有一个存在,并且可能在任何时刻表现出某种不可预测的活动。当然,对于精神病学家来说,更重要的是,在这样的系统中,“心灵”不再“比幽灵更幽灵”。相反,用神经生理学的科学术语来说,可以在不丧失范围或严谨性的情况下理解病态心态。对于神经学来说,该理论加深了特定活动所必需或仅仅足够的网络之间的区别,从而阐明了受干扰的结构与受干扰的功能的关系。在其自身领域,等效网络和狭义网络等效之间的差异表明了神经活动时间研究的适当使用和重要性:并且对于数学生物物理学,该理论提供了一种对已知网络进行严格符号处理的工具以及一种简单的方法构建所需属性的假设网络。
Hence arise constructional solutions of holistic problems involving the differentiated continuum of sense awareness and the normative, perfective and resolvent properties of perception and execution. From the irreciprocity of causality it follows that even if the net be known, though we may predict future from present activities, we can deduce neither afferent from central, nor central from efferent, nor past from present activities—conclusions which are reinforced by the contradictory testimony of eye-witnesses, by the difficulty of diagnosing differentially the organically diseased, the hysteric and the malingerer, and by comparing one’s own memories or recollections with his contemporaneous records. Moreover, systems which so respond to the difference between afferents to a regenerative net and certain activity within that net, as to reduce the difference, exhibit purposive behavior; and organisms are known to possess many such systems, subserving homeostasis, appetition and attention. Thus both the formal and the final aspects of that activity which we are wont to call mental are rigorously deducible from present neurophysiology. The psychiatrist may take comfort from the obvious conclusion concerning causality—that, for prognosis, history is never necessary. He can take little from the equally valid conclusion that his observables are explicable only in terms of nervous activities which, until recently, have been beyond his ken. The crux of this ignorance is that inference from any sample of overt behavior to nervous nets is not unique, whereas, of imaginable nets, only one in fact exists, and may, at any moment, exhibit some unpredictable activity. Certainly for the psychiatrist it is more to the point that in such systems “Mind” no longer “goes more ghostly than a ghost.” Instead, diseased mentality can be understood without loss of scope or rigor, in the scientific terms of neurophysiology. 
For neurology, the theory sharpens the distinction between nets necessary or merely sufficient for given activities, and so clarifies the relations of disturbed structure to disturbed function. In its own domain the difference between equivalent nets and nets equivalent in the narrow sense indicates the appropriate use and importance of temporal studies of nervous activity: and to mathematical biophysics the theory contributes a tool for rigorous symbolic treatment of known nets and an easy method of constructing hypothetical nets of required properties.
经 Springer 许可,由 McCulloch 和 Pitts (1943) 转载。
Reprinted from McCulloch and Pitts (1943), with permission from Springer.
本报告中描述但未命名的所谓“冯·诺依曼架构”具有图灵(1936 年,此处第 6 章)描述的“通用计算机”的逻辑结构,尽管这种相似性似乎有些巧合(参见第十八页)。保存程序运行数据的同一存储器也保存程序本身(第 94、104 页)。艾伦·图灵在他的自动计算引擎计划中引用了这份报告,但没有引用他自己的理论工作(图灵,1945)。
The so-called “von Neumann architecture” described but not named in this report has the logical structure of the “universal computing machine” described by Turing (1936, here chapter 6), though the similarity seems to be something of a coincidence (see page xviii). The same memory that holds the data on which a program operates holds the program itself (pages 94, 104). Alan Turing cites this report but not his own theoretical work in his plan for an Automatic Computing Engine (Turing, 1945).
1944 年,宾夕法尼亚大学摩尔学院的 Presper Eckert 和 John Mauchly 领导的团队中的两名成员 Arthur Burks(1915-2008)和 Herman Goldstine(1913-2004)设计了一台名为 EDVAC 的电子计算机,该计算机是其后继者埃克特-莫奇利 ENIAC 已经在为美国陆军进行弹道计算。在 Goldstine 偶然遇见了著名数学家约翰·冯·诺依曼(John von Neumann,1903-1957)并向他介绍了 EDVAC 项目后,冯·诺依曼加入了该小组。约翰·冯·诺依曼 (John von Neumann) 在这份备忘录中写下了这一设计,戈德斯坦 (Goldstine) 于 1945 年打印并发布了这份备忘录,其中仅列出了冯·诺依曼 (von Neumann) 作为作者。令人遗憾的是:该文档是冯·诺依曼的,但设计是该小组的,并且存储的程序已经是 ENIAC 的一部分(尽管在该机器中,该程序驻留在只读存储器中)。
In 1944 Arthur Burks (1915–2008) and Herman Goldstine (1913–2004) were two members of the team led by Presper Eckert and John Mauchly designing an electronic computer known as the EDVAC at the Moore School of the University of Pennsylvania—the successor to the Eckert–Mauchly ENIAC that was already grinding out ballistics calculations for the U.S. Army. After Goldstine happened to meet the eminent mathematician John von Neumann (1903–1957) and told him about the EDVAC project, von Neumann joined the group. John von Neumann wrote up the design in this memo, which Goldstine typed and released in 1945, listing only von Neumann as author. That was regrettable: the document was von Neumann’s, but the design was the group’s, and the stored program was already part of the ENIAC (though in that machine, the program resided in read-only memory).
图10.1显示了打字稿的封面页,非常粗糙。在可能的情况下,我们提供了原文中的部分交叉引用,留待以后填写。1946 年,一份修订后的报告(Burks 等人,1947 年)发表,题为“电子计算仪器逻辑设计的初步讨论”,并添加了 Goldstine 和 Burks 作为合著者。
Figure 10.1 shows the cover page of the typescript, which is very rough. Where possible, we have supplied section cross-references that in the original were left to be filled in later. A revised report (Burks et al., 1947) was issued in 1946 under the name “Preliminary discussion of the logical design of an electronic computing instrument,” adding Goldstine and Burks as co-authors.
图 10.1: “初稿”打字稿的标题页
Figure 10.1: Title page of “First draft” typescript
这些报告为计算机设计确定了方向,最终使软件行业成为可能,因为程序可以像任何其他类型的数据一样被处理并加载到内存中。报告使用二进制记数法将设备分析为内存和控制单元(“器官”),后面的“初步讨论”增加了对内存地址(“位置编号”)、各种寄存器、浮点表示和使用罗马字母表的前六个字母作为十六进制数字。但从短期来看,报告未能承认其他人的贡献,引起了设计团队创始成员的不满,尤其是 Mauchly,他起草了 EDVAC 及其前身机器 ENIAC 获得资助的提案,并且设计的。战争结束后,1947 年,莫奇利和埃克特中断了业务,开始了计算机业务,申请了专利,但由于本报告的提前发布,这些专利最终被宣告无效。(埃克特-莫奇利计算机公司后来的历史在第 169 页有概述。)伯克斯和戈德斯坦追求学术生涯,加入普林斯顿大学的冯·诺依曼。
These reports set the direction for computer design that would eventually make the software industry possible, since programs could be processed and loaded into memory like any other kind of data. The reports analyze the device into memory and control units (“organs”) using binary notation, and the later “Preliminary discussion” adds explanations of memory addresses (“location-numbers”), registers of various kinds, floating-point representation, and use of the first six letters of the Roman alphabet as hexadecimal digits. But in the short run, the reports’ failure to acknowledge the contributions of others caused resentment among the founding members of the design team—in particular Mauchly, who had authored the proposal under which the EDVAC and its predecessor machine the ENIAC had been funded and designed. After the war, Mauchly and Eckert broke off to start a computer business, in 1947 filing patents that were ultimately invalidated because of the prior release of this report. (The subsequent history of the Eckert–Mauchly Computer Corporation is sketched on page 169.) Burks and Goldstine pursued academic careers, joining von Neumann at Princeton.
当冯·诺依曼加入 EDVAC 团队时,他已经是世界上最有成就的数学家之一,为众多数学领域、数理经济学和量子力学做出了贡献。他在匈牙利出生并接受教育,1930 年移居柏林,成为大卫·希尔伯特 (David Hilbert) 周围圈子的一员(见第 5 章)。他在第二次世界大战爆发前移居美国,是设计原子弹的曼哈顿计划的重要贡献者。他亲眼目睹了数百台人类“计算机”使用单独的机械计算器来求解他在裂变炸弹设计中得出的方程,尽管他无法与他在摩尔学院工作的小组讨论这项机密工作在普林斯顿大学,这为他开发高速自动计算机的工作提供了动力。
By the time von Neumann joined the EDVAC team, he was already one of the most accomplished mathematicians in the world, having made contributions to a great variety of mathematical fields, to mathematical economics, and to quantum mechanics. Born and educated in Hungary, he moved to Berlin in 1930 to be part of the circle surrounding David Hilbert (see chapter 5). He emigrated to the US before the outbreak of the Second World War and was an important contributor to the Manhattan Project, which designed the atomic bomb. He witnessed the work of hundreds of human “computers” using individual mechanical calculators to solve the equations he had worked out in the design of a fission bomb, and though he could not discuss that classified work with the groups he worked with at the Moore School or at Princeton, it provided motivation for his work developing high-speed automatic computers.
“初稿”中的一些术语是特定于实施的,令人困惑。存储器采用延迟线的形式,以我们今天所说的 32 位字进行组织(冯·诺依曼将其称为“小周期”)。32 位中的一位是一个标志,用于指示该字是代表数字还是代表指令(“命令”)。数字使用另一位作为符号,其余位代表 −1 和 1 之间的二进制分数。32 个小周期构成一个“主周期”或延迟线器官 (DLA)。整个存储器由 256 个 DLA 组成,因此,例如,要指定存储器中数字的地址,需要 8 位来标识 DLA,并需要另外 5 位来选出小周期。
Some of the terminology of the “First draft” is confusingly implementation-specific. The memory was in the form of a delay line, organized in what we would today call 32-bit words (von Neumann called them “minor cycles”). One of the 32 bits was a flag to indicate whether the word represented a number or an instruction (“order”). Numbers used another bit for the sign, with the remaining bits representing a binary fraction between −1 and 1. Thirty-two minor cycles constituted a “major cycle” or delay line organ (DLA). The full memory comprised 256 DLAs, so specifying, for example, the address of a number in memory required 8 bits to identify the DLA and 5 more bits to single out the minor cycle.
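The addressing arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration: the names `pack_address` and `unpack_address`, and the choice to put the DLA number in the high-order bits, are assumptions made here and are not taken from the report.

```python
# Illustrative sketch of the 8 + 5 bit addressing described above.
# The field order (DLA number in the high bits) is an assumption.
DLA_COUNT = 256        # delay line organs in the full memory
CYCLES_PER_DLA = 32    # minor cycles (words) per DLA

DLA_BITS = (DLA_COUNT - 1).bit_length()         # bits needed to name a DLA
CYCLE_BITS = (CYCLES_PER_DLA - 1).bit_length()  # bits needed to name a minor cycle
TOTAL_WORDS = DLA_COUNT * CYCLES_PER_DLA        # words in the whole memory


def pack_address(dla: int, minor_cycle: int) -> int:
    """Combine a DLA number and a minor-cycle number into one address."""
    if not 0 <= dla < DLA_COUNT:
        raise ValueError("DLA number out of range")
    if not 0 <= minor_cycle < CYCLES_PER_DLA:
        raise ValueError("minor cycle number out of range")
    return (dla << CYCLE_BITS) | minor_cycle


def unpack_address(address: int) -> tuple[int, int]:
    """Split an address back into (DLA number, minor-cycle number)."""
    return address >> CYCLE_BITS, address & (CYCLES_PER_DLA - 1)
```

With these figures an address occupies 13 bits in all, and the memory holds 256 × 32 = 8192 words.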
为了达到一定程度的设计抽象,“初稿”提出了一个可能具有多种物理实现的双稳态“元素”。在第10.4.2节中,冯·诺依曼详细引用了麦卡洛克和皮茨,但他没有提到图灵。在省略的部分中,报告继续指出“刺激自身的元素将无限期地保持刺激”,正如“逻辑演算”中对神经元所说的那样(第 87 页)。冯·诺依曼使用的同步电路时钟让人想起麦卡洛克和皮茨施加在神经元电路上的离散时间变量,以使它们的行为接受逻辑分析。事实上,真空管和神经元之间的类比在设计中并没有被大量使用,尽管冯·诺依曼对大脑计算能力的推测很感兴趣,而且 EDVAC 的程序存储在快速存储器中,而不是存储在快速存储器中。像艾肯的 Mark I 那样,存储在卡或磁带上,这样就可以以电子速度访问连续的指令。
To achieve a level of design abstraction, the “First draft” posits a bistable “element” that might have a variety of physical realizations. In §10.4.2, von Neumann cites McCulloch and Pitts at some length, but he does not mention Turing. In an omitted section, the report goes on to note that “an element which stimulates itself will hold a stimulus indefinitely,” exactly as “A Logical Calculus” had said of neurons (page 87). And von Neumann’s use of circuitry synchronized by a clock is reminiscent of the discrete time variable McCulloch and Pitts imposed on neuronal circuits to subject their behavior to logical analysis. In fact the analogy between a vacuum tube and a neuron is not heavily used in the design, though it was of interest to von Neumann in his speculations about the computational power of the brain, and the EDVAC’s program was stored in fast memory, rather than on cards or tape as in Aiken’s Mark I, simply so that successive instructions could be accessed at electronic speeds.
“初稿”中关于神经元计算的讨论在“初步讨论”中消失了,但冯·诺依曼从未对大脑作为计算机的想法失去兴趣,并且在他因癌症去世时仍在研究它53. 《计算机与大脑》(冯·诺依曼,2000 年)在死后出版。
The discussion of neuronal computing in this “First draft” disappeared in the “Preliminary discussion,” but von Neumann never lost interest in the idea of the brain as a computer and was still working on it at the time of his death from cancer at age 53. The Computer and the Brain (von Neumann, 2000) was published posthumously.
控制该操作的指令必须以绝对详尽的细节提供给设备。它们包括解决所考虑的问题所需的所有数字信息:因变量的初始值和边界值、固定参数(常数)的值、问题陈述中出现的固定函数表。这些指令必须以设备可以感知的某种形式给出:打入打孔卡系统或电传打字带上,以磁性方式压印在钢带或钢丝上,以照相方式压印在电影胶片上,连接到一个或多个固定或可更换的插板中——这个列表并不一定完整。所有这些过程都需要使用一些代码来表达所考虑问题的逻辑和代数定义,以及必要的数字材料(参见上文)。
The instructions which govern this operation must be given to the device in absolutely exhaustive detail. They include all numerical information which is required to solve the problem under consideration: Initial and boundary values of the dependent variables, values of fixed parameters (constants), tables of fixed functions which occur in the statement of the problem. These instructions must be given in some form which the device can sense: Punched into a system of punchcards or on teletype tape, magnetically impressed on steel tape or wire, photographically impressed on motion picture film, wired into one or more fixed or exchangeable plugboards—this list being by no means necessarily complete. All these procedures require the use of some code to express the logical and the algebraical definition of the problem under consideration, as well as the necessary numerical material (cf. above).
一旦向设备发出这些指令,它必须能够完全执行它们,而无需进一步的智能人工干预。在所需操作结束时,设备必须以上述形式之一再次记录结果。结果是数值数据;它们是设备在执行上述指令的过程中产生的数字材料的指定部分。
Once these instructions are given to the device, it must be able to carry them out completely and without any need for further intelligent human intervention. At the end of the required operations the device must record the results again in one of the forms referred to above. The results are numerical data; they are a specified part of the numerical material produced by the device in the process of carrying out the instructions referred to above.
然而,在某种程度上甚至可以避免这些现象。该设备可以自动识别最常见的故障,通过外部可见的标志指示其存在和位置,然后停止。在某些条件下,它甚至可能自动执行必要的校正并继续(参见第10.3.3节)。
However, it may be possible to avoid even these phenomena to some extent. The device may recognize the most frequent malfunctions automatically, indicate their presence and location by externally visible signs, and then stop. Under certain conditions it might even carry out the necessary correction automatically and continue (cf. §10.3.3).
然而,必须指出的是,虽然这一原则本身可能是合理的,但其实现的具体方式需要仔细审查。即使是上面的运算列表:+、−、×、÷,也不是毫无疑问的。它可以扩展为包括诸如 √、∛、sgn、| | 之类的运算,还有 log₁₀、log₂、ln、sin 及其反函数等。也可以考虑对其进行限制,例如省略 ÷ 甚至 ×。人们还可以考虑更具弹性的安排。对于某些操作,可以设想完全不同的过程,例如使用逐次逼近方法或函数表。……无论如何,设备的中央算术部分可能必须存在,这构成了第一个特定部分:CA。
It must be observed, however, that while this principle as such is probably sound, the specific way in which it is realized requires close scrutiny. Even the above list of operations: +, −, ×, ÷, is not beyond doubt. It may be extended to include such operations as √, ∛, sgn, | |, also log₁₀, log₂, ln, sin and their inverses, etc. One might also consider restricting it, e.g. omitting ÷ and even ×. One might also consider more elastic arrangements. For some operations radically different procedures are conceivable, e.g. using successive approximation methods or function tables. … At any rate a central arithmetical part of the device will probably have to exist, and this constitutes the first specific part: CA.
(a) 即使在进行乘法或除法的过程中,也必须记住一系列中间(部分)结果。这在较小程度上甚至适用于加法和减法(当进位数字可能必须在多个位置上进位时),并且在更大程度上适用于 √、∛(如果需要这些运算)。……
(a) Even in the process of carrying out a multiplication or a division, a series of intermediate (partial) results must be remembered. This applies to a lesser extent even to additions and subtractions (when a carry digit may have to be carried over several positions), and to a greater extent to √, ∛, if these operations are wanted. …
(b) 管理复杂问题的指令可能构成相当多的材料,特别是如果代码是间接的(在大多数安排中都是如此)。这个材料一定要记住。
(b) The instructions which govern a complicated problem may constitute a considerable material, particularly so, if the code is circumstantial (which it is in most arrangements). This material must be remembered.
(c) 在许多问题中,特定函数起着至关重要的作用。它们通常以表格的形式给出。事实上,在某些情况下,这是通过经验给出它们的方式(例如,许多流体动力学问题中物质的状态方程),在其他情况下,它们可以通过解析表达式给出,但无论如何,它可能更简单、更快从固定表格中获取它们的值,而不是在需要值时重新计算它们(基于分析定义)。通常,只有中等数量的条目(100-200)的表并使用插值法是很方便的。在大多数情况下,线性甚至二次插值是不够的,因此最好依靠三次或双二次(甚至更高阶)插值标准……。
(c) In many problems specific functions play an essential role. They are usually given in form of a table. Indeed in some cases this is the way in which they are given by experience (e.g. the equation of state of a substance in many hydrodynamical problems), in other cases they may be given by analytical expressions, but it may nevertheless be simpler and quicker to obtain their values from a fixed tabulation, than to compute them anew (on the basis of the analytical definition) whenever a value is required. It is usually convenient to have tables of a moderate number of entries only (100–200) and to use interpolation. Linear and even quadratic interpolation will not be sufficient in most cases, so it is best to count on a standard of cubic or biquadratic (or even higher order) interpolation ….
§ 10.2.2中提到的一些函数可以这样处理: log₁₀、log₂、ln、sin 及其反函数,也可能包括 √、∛。即使是倒数也可以以这种方式处理,从而将 ÷ 简化为 ×。
Some of the functions mentioned in the course of §10.2.2 may be handled in this way: log₁₀, log₂, ln, sin and their inverses, possibly also √, ∛. Even the reciprocal might be treated in this manner, thereby reducing ÷ to ×.
(d) 对于偏微分方程,初始条件和边界条件可能构成广泛的数值材料,必须在整个给定问题中记住它们。
(d) For partial differential equations the initial conditions and the boundary conditions may constitute an extensive numerical material, which must be remembered throughout a given problem.
(e) 对于双曲型或抛物型偏微分方程,沿变量t积分,必须记住属于周期t的(中间)结果,以计算周期t + dt。这种材料大部分属于 (d) 类型,只是它不是由人类操作员放入设备中,而是由设备本身产生(并且可能随后再次被删除并由 t + dt 的相应数据替换),在其自动运行过程。
(e) For partial differential equations of the hyperbolic or parabolic type, integrated along a variable t, the (intermediate) results belonging to the cycle t must be remembered for the calculation of the cycle t + dt. This material is much of the type (d), except that it is not put into the device by human operators, but produced (and probably subsequently again removed and replaced by the corresponding data for t + dt) by the device itself, in the course of its automatic operation.
(f) 对于全微分方程(d)、(e)也适用,但它们需要较小的存储容量。在依赖于给定常量、固定参数等的问题中,需要类型 (d) 的进一步内存需求。
(f) For total differential equations (d), (e) apply too, but they require smaller memory capacities. Further memory requirements of the type (d) are required in problems which depend on given constants, fixed parameters, etc.
(g) 通过逐次逼近解决的问题(例如椭圆型偏微分方程,用松弛方法处理)需要类型(e)的存储器:必须记住每个逼近的(中间)结果,而正在计算下一个。
(g) Problems which are solved by successive approximations (e.g. partial differential equations of the elliptic type, treated by relaxation methods) require a memory of the type (e): The (intermediate) results of each approximation must be remembered, while those of the next one are being computed.
(h) 分类问题和某些统计实验(非常高速的设备为此提供了一个有趣的机会)需要对正在处理的材料进行记忆。
(h) Sorting problems and certain statistical experiments (for which a very high speed device offers an interesting opportunity) require a memory for the material which is being treated.
无论如何,总内存构成了设备的第三个特定部分:M。
At any rate the total memory constitutes the third specific part of the device: M.
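As a rough illustration of item (c) above, the following Python sketch tabulates a function at a moderate number of entries and reads values back by cubic interpolation. The names `build_table` and `cubic_lookup`, the choice of sin on [0, π/2], the 128-entry table, and the use of a Lagrange polynomial through four neighbouring entries are all assumptions made for this example; the report does not prescribe a particular interpolation formula.

```python
import math


def build_table(f, lo, hi, entries):
    """Tabulate f at `entries` equally spaced points from lo to hi."""
    step = (hi - lo) / (entries - 1)
    xs = [lo + i * step for i in range(entries)]
    return xs, [f(x) for x in xs]


def cubic_lookup(xs, ys, x):
    """Evaluate the cubic Lagrange polynomial through four entries near x.

    Assumes an equally spaced table with at least four entries and
    xs[0] <= x <= xs[-1].
    """
    n = len(xs)
    step = xs[1] - xs[0]
    i = int((x - xs[0]) / step)          # entry at or just below x
    start = min(max(i - 1, 0), n - 4)    # first of the four entries used
    total = 0.0
    for j in range(start, start + 4):
        term = ys[j]
        for k in range(start, start + 4):
            if k != j:
                term *= (x - xs[k]) / (xs[j] - xs[k])
        total += term
    return total


# A table of moderate size, within the 100-200 entries the text suggests.
xs, ys = build_table(math.sin, 0.0, math.pi / 2, 128)
approx = cubic_lookup(xs, ys, 0.5)
error = abs(approx - math.sin(0.5))
```

The point of the sketch is the trade the text describes: the table occupies memory, but each lookup costs only a handful of multiplications and additions instead of a fresh evaluation of the analytical definition.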
换句话说:设备的 C 部分和 M 部分之间的所有数字(或其他)信息传输都必须通过这些部分中包含的机制来实现。然而,仍然需要将原始确定信息从外部获取到设备中,以及将最终信息、结果从设备获取到外部。
In other words: All transfers of numerical (or other) information between the parts C and M of the device must be effected by the mechanisms contained in these parts. There remains, however, the necessity of getting the original definitory information from outside into the device, and also of getting the final information, the results, from the device into the outside.
我们所说的外部是指第10.1.2节中描述的类型的介质:这里的信息或多或少可以直接通过人类行为产生(打字、打孔、拍摄由相同类型的按键产生的光脉冲、磁化金属带或电线)某种类似的方式等),它可以被静态存储,最终或多或少地被人体器官直接感知。
By the outside we mean media of the type described in §10.1.2: Here information can be produced more or less directly by human action (typing, punching, photographing light impulses produced by keys of the same type, magnetizing metal tape or wire in some analogous manner, etc.), it can be statically stored, and finally sensed more or less directly by human organs.
该设备必须能够保持输入和输出(感觉和运动)与此类特定介质的接触(参见§ 10.1.2):该介质将被称为设备的外部记录介质: R . 现在我们有:
The device must be endowed with the ability to maintain the input and output (sensory and motor) contact with some specific medium of this type (cf. §10.1.2): That medium will be called the outside recording medium of the device: R. Now we have:
对第10.2.4节 (a)–(h)中列举的 M 的典型函数的检查表明: 移位 (a)(进行算术运算时所需的短时内存)会很方便out)在设备外部,即从M进入R。(实际上(a)将在设备内部,但在 CA 中而不是在 M 中……)所有现有设备,甚至现有的台式计算机,此时都使用 M 的等效项。然而(b)(逻辑指令)可以从外部感测,即通过来自R的I感测,(c)(函数表)和(e)、(g)(中间结果)也是如此。后者可以在设备产生它们时由 O 传送到 R,并在需要它们时由 I 从 R 感测到。(d)(初始条件和参数)甚至可能(f)(全微分方程的中间结果)在某种程度上也是如此。至于(h)(排序和统计),情况有些模棱两可:在许多情况下,使用M的可能性会决定性地加速事情的发展,但是将M的使用与R的更广泛的使用适当混合可能是可行的,而不会造成严重损失速度并显着增加可处理的材料量。
Inspection of the typical functions of M, as enumerated in §10.2.4, (a)–(h), shows this: It would be convenient to shift (a) (the short-duration memory required while an arithmetical operation is being carried out) outside the device, i.e. from M into R. (Actually (a) will be inside the device, but in CA rather than in M. …) All existing devices, even the existing desk computing machines, use the equivalent of M at this point. However (b) (logical instructions) might be sensed from outside, i.e. by I from R, and the same goes for (c) (function tables) and (e), (g) (intermediate results). The latter may be conveyed by O to R when the device produces them, and sensed by I from R when it needs them. The same is true to some extent of (d) (initial conditions and parameters) and possibly even of (f) (intermediate results from a total differential equation). As to (h) (sorting and statistics), the situation is somewhat ambiguous: In many cases the possibility of using M accelerates matters decisively, but suitable blending of the use of M with a longer range use of R may be feasible without serious loss of speed and increase the amount of material that can be handled considerably.
事实上,所有现有的(全自动或部分自动)计算设备都使用 R(作为一堆打孔卡或一段电传打字带)来实现所有这些目的(除了 (a),如上所述)。然而,看起来真正的高速设备的用途将非常有限,除非它可以依靠 M,而不是 R,来实现第 10.2.4 节(a)-(h)中列举的所有目的,并且具有某些(e)、(g)、(h) …情况下的限制。
Indeed, all existing (fully or partially automatic) computing devices use R—as a stack of punchcards or a length of teletype tape—for all these purposes (excepting (a), as pointed out above). Nevertheless it will appear that a really high speed device would be very limited in its usefulness unless it can rely on M, rather than on R, for all the purposes enumerated in §10.2.4, (a)–(h), with certain limitations in the case of (e), (g), (h) ….
在这样的讨论过程中,所需的特征和似乎最适合保护它们的布置将逐渐具体化,直到设备及其控制呈现出相当明确的形状。正如之前所强调的,这适用于物理设备以及控制其功能的算术和逻辑安排。
In the course of such a discussion the desired features and the arrangements which seem best suited to secure them will crystallize gradually until the device and its control assume a fairly definite shape. As emphasized before, this applies to the physical device as well as to the arithmetical and logical arrangements which govern its functioning.
每个数字计算设备都包含某些具有离散平衡的类似继电器的元件。这样的元素具有两种或多种不同的状态,在这些状态中它可以无限期地存在。这些可能是完美的平衡,在每种平衡中,元素将在没有任何外部支持的情况下保持不变,而适当的外部刺激会将其从一种平衡转移到另一种平衡。或者,可能有两种状态,其中一种是在没有外部支持时存在的平衡,而另一种则取决于外部刺激的存在。中继动作表现为:每当元件本身接收到上述类型的刺激时,它就会发出刺激。发出的刺激必须与接收到的刺激属于同一类型,也就是说,它们必须能够刺激其他元素。然而,接收到的刺激和发出的刺激之间一定不存在能量关系,也就是说,接收到一个刺激的元件必须能够发出多个相同强度的刺激。换句话说:作为一个继电器,该元件必须从另一个来源接收能量供应,而不是输入刺激。
Every digital computing device contains certain relay like elements, with discrete equilibria. Such an element has two or more distinct states in which it can exist indefinitely. These may be perfect equilibria, in each of which the element will remain without any outside support, while appropriate outside stimuli will transfer it from one equilibrium into another. Or, alternatively, there may be two states, one of which is an equilibrium which exists when there is no outside support, while the other depends for its existence upon the presence of an outside stimulus. The relay action manifests itself in the emission of stimuli by the element whenever it has itself received a stimulus of the type indicated above. The emitted stimuli must be of the same kind as the received one, that is, they must be able to stimulate other elements. There must, however, be no energy relation between the received and the emitted stimuli, that is, an element which has received one stimulus, must be able to emit several of the same intensity. In other words: Being a relay, the element must receive its energy supply from another source than the incoming stimulus.
在现有的数字计算设备中,各种机械或电气设备已被用作元件: 轮子,可以锁定到十个(或更多)重要位置中的任何一个,并且在从一个位置移动到另一个位置时发射可能导致其他位置的电脉冲。类似的轮子移动;由电磁体驱动并打开或关闭电路的单个或组合电报继电器;这两个元素的组合;——最后,存在使用真空管的合理且诱人的可能性,栅极充当阴极板电路的阀门。在最后提到的情况下,栅极也可能被偏转装置所取代,即真空管被阴极射线管取代——但很可能在未来一段时间内,真空管的更大可用性和各种电气优势将使第一个程序在前台运行。
In existing digital computing devices various mechanical or electrical devices have been used as elements: Wheels, which can be locked into any one of ten (or more) significant positions, and which on moving from one position to another transmit electric pulses that may cause other similar wheels to move; single or combined telegraph relays, actuated by an electromagnet and opening or closing electric circuits; combinations of these two elements;—and finally there exists the plausible and tempting possibility of using vacuum tubes, the grid acting as a valve for the cathode-plate circuit. In the last mentioned case the grid may also be replaced by deflecting organs, i.e. the vacuum tube by a cathode ray tube—but it is likely that for some time to come the greater availability and various electrical advantages of the vacuum tubes proper will keep the first procedure in the foreground.
任何这样的设备都可以通过其元件的连续反应时间来自主计时。在这种情况下,所有刺激最终都必须源自输入。或者,他们的计时可能受到固定时钟的影响,固定时钟提供了其在确定的周期性重复时刻发挥作用所必需的某些刺激。该时钟可以是机械或混合机电装置中的旋转轴;它可能是纯电气设备中的电振荡器(可能是晶体控制的)。如果依赖于设备同时执行的几个不同操作序列的同步性,则时钟印象定时显然是更可取的。我们将在上述定义的技术意义上使用术语“元素” ,并根据其时序是由时钟还是自主来调用设备,并将其称为同步或异步,如上所述。
Any such device may time itself autonomously, by the successive reaction times of its elements. In this case all stimuli must ultimately originate in the input. Alternatively, they may have their timing impressed by a fixed clock, which provides certain stimuli that are necessary for its functioning at definite periodically recurrent moments. This clock may be a rotating axis in a mechanical or a mixed, mechanico-electrical device; and it may be an electrical oscillator (possibly crystal controlled) in a purely electrical device. If reliance is to be placed on synchronisms of several distinct sequences of operations performed simultaneously by the device, the clock impressed timing is obviously preferable. We will use the term element in the above defined technical sense, and call the device synchronous or asynchronous, according to whether its timing is impressed by a clock or autonomous, as described above.
遵循 McCulloch 和 Pitts(1943 年,此处第 9 章),我们忽略了神经元功能的更复杂的方面:阈值、时间求和、相对抑制、突触延迟之外的刺激后效应引起的阈值变化等。然而,它是,方便偶尔考虑具有固定阈值 2 和 3 的神经元,即只能通过 2 或 3 个兴奋性突触上的(同时)刺激(而抑制性突触上没有)刺激的神经元。……
Following McCulloch and Pitts (1943, here chapter 9) we ignore the more complicated aspects of neuron functioning: Thresholds, temporal summation, relative inhibition, changes of the threshold by after-effects of stimulation beyond the synaptic delay, etc. It is, however, convenient to consider occasionally neurons with fixed thresholds 2 and 3, that is, neurons which can be excited only by (simultaneous) stimuli on 2 or 3 excitatory synapses (and none on an inhibitory synapse). …
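The simplified neuron just described can be modelled as a small Python function. This is a sketch under stated assumptions: the function name `neuron_fires`, the representation of stimuli as booleans, and the three gate helpers are illustrative choices, and the synaptic delay is left implicit (the return value stands for the output one delay later).

```python
def neuron_fires(excitatory, inhibitory=(), threshold=1):
    """Simplified threshold neuron.

    Fires only if no inhibitory synapse is stimulated and at least
    `threshold` excitatory synapses are stimulated at the same time.
    """
    if any(inhibitory):
        return False
    return sum(bool(s) for s in excitatory) >= threshold


def or_gate(a, b):
    """Threshold 1: either excitatory stimulus suffices."""
    return neuron_fires([a, b], threshold=1)


def and_gate(a, b):
    """Threshold 2: both excitatory stimuli are required."""
    return neuron_fires([a, b], threshold=2)


def and_not_gate(a, b):
    """a excites, b inhibits: fires when a is present and b is absent."""
    return neuron_fires([a], inhibitory=[b], threshold=1)
```

A threshold-3 neuron is obtained the same way, with three excitatory inputs and `threshold=3`.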
很容易看出,这些简化的神经元功能可以通过电报继电器或真空管来模拟。尽管神经系统可能是异步的(对于突触延迟),但可以通过使用同步设置获得精确的突触延迟。……
It is easily seen that these simplified neuron functions can be imitated by telegraph relays or by vacuum tubes. Although the nervous system is presumably asynchronous (for the synaptic delays), precise synaptic delays can be obtained by using synchronous setups. …
在接下来的考虑中,我们将相应地假设该设备具有真空管作为元件。我们还将尝试在所使用的管子类型是传统的和市售的管子的基础上,对所涉及的管子数量、时间等进行所有估计。也就是说,不使用异常复杂或具有全新功能的管子。在对传统类型(或一些等效元素……)进行彻底分析后,使用新型管的可能性实际上会变得更加清晰和明确。
In the considerations which follow we will assume accordingly, that the device has vacuum tubes as elements. We will also try to make all estimates of numbers of tubes involved, timing, etc., on the basis that the types of tubes used are the conventional and commercially available ones. That is, that no tubes of unusual complexity or with fundamentally new functions are to be used. The possibilities for the use of new types of tubes will actually become clearer and more definite after a thorough analysis with the conventional types (or some equivalent elements …) has been carried out.
最后,同步设备似乎具有相当大的优势……。
Finally, it will appear that a synchronous device has considerable advantages ….
§ 10.4.3意义上的元件,用作电流阀或门的真空管,是一种全有或全无的装置,或者至少它近似于一种:根据栅极偏压是否高于或低于截止值,它是否会通过电流。确实,它的所有电极上都需要确定的电势才能维持任一状态,但是真空管的组合具有完美的平衡:每种组合都可以无限期地存在几种状态,无需任何外部支持,只要适当外部刺激(电脉冲)会将其从一种平衡转移到另一种平衡。这些就是所谓的触发电路,基本电路有两个平衡点并包含两个三极管或一个五极管。具有两个以上平衡点的触发电路所涉及的程度要高得多。
The element in the sense of §10.4.3, the vacuum tube used as a current valve or gate, is an all-or-none device, or at least it approximates one: According to whether the grid bias is above or below cut-off, it will pass current or not. It is true that it needs definite potentials on all its electrodes in order to maintain either state, but there are combinations of vacuum tubes which have perfect equilibria: Several states in each of which the combination can exist indefinitely, without any outside support, while appropriate outside stimuli (electric pulses) will transfer it from one equilibrium into another. These are the so called trigger circuits, the basic one having two equilibria and containing two triodes or one pentode. The trigger circuits with more than two equilibria are disproportionately more involved.
因此,无论管子用作门还是触发器,全有或全无的两种平衡布置都是最简单的布置。由于这些管装置是通过数字来处理数字的,因此很自然地使用数字也是二值的算术系统。这表明使用二进制系统。
Thus, whether the tubes are used as gates or as triggers, the all-or-none, two equilibrium, arrangements are the simplest ones. Since these tube arrangements are to handle numbers by means of their digits, it is natural to use a system of arithmetic in which the digits are also two valued. This suggests the use of the binary system.
§§ 10.4.2 – 10.4.3中讨论的人类神经元的类似物同样是全有或全无元素。看起来它们对于真空管系统的所有初步、定向、考虑非常有用(参见§§ 10.6.1 – 10.6.2)。因此,令人满意的是,这里要处理的自然算术系统也是二进制系统。
The analogs of human neurons, discussed in §§10.4.2–10.4.3, are equally all-or-none elements. It will appear that they are quite useful for all preliminary, orienting, considerations of vacuum tube systems (cf. §§10.6.1–10.6.2). It is therefore satisfactory that here too the natural arithmetical system to handle is the binary one.
当然,必须记住,人类直接使用的数字材料很可能必须以十进制表示。因此,R 中使用的符号应该是十进制。但无论如何,最好在 CA 中以及在可能进入中央控制 CC 的任何数字材料中使用严格的二进制程序。因此 M 应该只存储二进制材料。
It must be remembered, of course, that the numerical material which is directly in human use, is likely to have to be expressed in the decimal system. Hence, the notations used in R should be decimal. But it is nevertheless preferable to use strictly binary procedures in CA, and also in whatever numerical material may enter into the central control CC. Hence M should store binary material only.
这就需要将十进制-二进制和二进制-十进制转换设施合并到 I 和 O 中。由于这些转换需要大量算术操作,因此使用 CA 是最经济的,因此为了协调目的,也使用 CC,与 I 和 O 相关然而,CA 的使用意味着两种转换中使用的所有算术都必须严格是二进制的。……
This necessitates incorporating decimal-binary and binary-decimal conversion facilities into I and O. Since these conversions require a good deal of arithmetical manipulating, it is most economical to use CA, and hence for coordinating purposes also CC, in connection with I and O. The use of CA implies, however, that all arithmetics used in both conversions must be strictly binary. …
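To make the conversion requirement concrete, here is a minimal Python sketch of decimal-to-binary and binary-to-decimal conversion. It works on non-negative integers for simplicity, whereas the report's numbers are binary fractions, so this is an illustration of the idea rather than the report's procedure. The function names are hypothetical. The first function uses only shifts and additions; the second relies on division by ten, which the machine would itself have to carry out with its binary arithmetic.

```python
def decimal_digits_to_binary(digits):
    """Convert decimal digits (most significant first) to a binary integer.

    Uses Horner's rule, with multiplication by ten written as
    8x + 2x so that only binary shifts and additions are needed.
    """
    value = 0
    for d in digits:
        if not 0 <= d <= 9:
            raise ValueError("not a decimal digit")
        value = (value << 3) + (value << 1) + d
    return value


def binary_to_decimal_digits(value):
    """Convert a non-negative integer to decimal digits, most significant first."""
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    digits = []
    while True:
        value, d = divmod(value, 10)   # repeated division by ten
        digits.append(d)
        if value == 0:
            break
    return digits[::-1]
```

For example, `decimal_digits_to_binary([1, 9, 4, 5])` yields the integer 1945, and `binary_to_decimal_digits(1945)` recovers the digit list.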
很自然地观察到,在十进制系统中,所需的步数要少得多:8² = 64 步,可能加倍,即大约 100 步。然而,如此低的数字是以使用乘法表或以其他方式增加或复杂化设备为代价换来的。以这个代价,也可以通过更直接的二进制技巧来缩短该过程,这将在稍后考虑。因此,似乎没有必要单独讨论十进制过程。
It is natural to observe that in the decimal system a considerably smaller number of steps obtains: 8² = 64 steps, possibly doubled, that is about 100 steps. However, this low number is purchased at the price of using a multiplication table or otherwise increasing or complicating the equipment. At this price the procedure can be shortened by more direct binary artifices, too, which will be considered presently. For this reason it seems not necessary to discuss the decimal procedure separately.
避免这些较长持续时间的逻辑方法是对操作进行伸缩,即同时执行尽可能多的操作。由于进位的复杂性,即使是加法或减法这样简单的运算也无法一次完成。在除法中,除非已知其左边的所有数字,否则某一位数字的计算甚至无法开始。然而,相当程度的同时化是可能的:在加法或减法中,所有对应数字对可以立即组合,所有第一个进位数字可以在下一步中一起应用,等等。在乘法中,所有形式为(被乘数)×(乘数位)的部分积可以同时形成和定位;在二进制系统中,这样的部分积为零或被乘数,因此这只是定位问题。在加法和乘法中,可以使用上述加速形式的加法和减法。此外,在乘法中,通过将第一对与第二对、第三对等同时相加,可以快速对部分积求和;然后将第一对对和与第二对、第三对和等同时相加;依此类推,直到收集完所有项。(由于 27 ≤ 2⁵,这允许在 5 次加法时间内收集 27 个部分和(假设乘数有 27 个二进制数字)。该方案由 H. Aiken 提出。)
The logical procedure to avoid these long durations consists of telescoping operations, that is, of carrying out simultaneously as many as possible. The complexities of carrying prevent even such simple operations as addition or subtraction from being carried out at once. In division the calculation of a digit cannot even begin unless all digits to its left are already known. Nevertheless considerable simultaneisations are possible: In addition or subtraction all pairs of corresponding digits can be combined at once, all first carry digits can be applied together in the next step, etc. In multiplication all the partial products of the form (multiplicand) × (multiplier digit) can be formed and positioned simultaneously; in the binary system such a partial product is zero or the multiplicand, hence this is only a matter of positioning. In both addition and multiplication the above mentioned accelerated forms of addition and subtraction can be used. Also, in multiplication the partial products can be summed up quickly by adding the first pair together simultaneously with the second pair, the third pair, etc.; then adding the first pair of pair sums together simultaneously with the second one, the third one, etc.; and so on until all terms are collected. (Since 27 ≤ 2⁵, this allows collecting 27 partial sums, assuming a 27 binary digit multiplier, in 5 addition times. This scheme is due to H. Aiken.)
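The pairwise summation of partial products described above can be sketched as follows. The function names are hypothetical, and the Python loop only counts the rounds that hardware would perform in parallel; it is not a model of any actual circuit.

```python
def partial_products(multiplicand, multiplier, bits):
    """Form the positioned partial products of a binary multiplication.

    Each one is either zero or the multiplicand shifted into position.
    """
    return [(multiplicand << i) if (multiplier >> i) & 1 else 0
            for i in range(bits)]


def sum_in_pairs(terms):
    """Add terms pairwise, round by round, and count the rounds.

    All additions within one round are independent of each other, so a
    machine with enough adders could perform each round in one addition time.
    Assumes at least one term.
    """
    terms = list(terms)
    rounds = 0
    while len(terms) > 1:
        paired = [terms[i] + terms[i + 1] for i in range(0, len(terms) - 1, 2)]
        if len(terms) % 2:          # an odd term passes through unchanged
            paired.append(terms[-1])
        terms = paired
        rounds += 1
    return terms[0], rounds


a, b = 123456789, 98765432          # both fit in 27 binary digits
product, rounds = sum_in_pairs(partial_products(a, b, 27))
```

With 27 partial products the list shrinks from 27 to 14, 7, 4, 2, and 1 terms, which is the 5 addition times the text mentions.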
这种加速、伸缩程序正在所有现有设备中使用。(使用十进制系统,无论有或没有进一步的伸缩技巧,也是这种类型,如第10.5.3节末尾所指出的。它实际上比纯粹的二进过程效率要低一些。第10.5 节的论点。 1 – 10.5.2反对在这里考虑它。)但是,它们仅以精确地增加必要设备的速率(即设备中的元件数量)节省时间:显然,如果通过系统地执行来将持续时间减半二一次添加,需要双重添加设备(即使假设可以在没有不成比例的控制设施的情况下使用并且完全有效)等。
Such accelerating, telescoping procedures are being used in all existing devices. (The use of the decimal system, with or without further telescoping artifices is also of this type, as pointed out at the end of §10.5.3. It is actually somewhat less efficient than purely dyadic procedures. The arguments of §§10.5.1–10.5.2 speak against considering it here.) However, they save time only at exactly the rate at which they multiply the necessary equipment, that is the number of elements in the device: Clearly if a duration is halved by systematically carrying out two additions at once, double adding equipment will be required (even assuming that it can be used without disproportionate control facilities and fully efficiently), etc.
这种通过增加设备来赢得时间的方式在非真空管元件设备中是完全合理的,在非真空管元件设备中,赢得时间至关重要,并且在处理包含许多元件的相关设备方面可以获得丰富的工程经验。根据所有可用的经验,按照这些思路构建的真正通用的自动数字计算系统必须包含超过 10,000 个元素。
This way of gaining time by increasing equipment is fully justified in non vacuum tube element devices, where gaining time is of the essence, and extensive engineering experience is available regarding the handling of involved devices containing many elements. A really all-purpose automatic digital computing system constructed along these lines must, according to all available experience, contain over 10,000 elements.
正如第10.4.3节所指出的,不太复杂的真空管装置的反应时间可以缩短至一微秒。现在,按照这个速率,即使是第10.5.3节中获得的未操纵的乘法持续时间也是可以接受的:1000-1500 个反应时间相当于 1-1.5 毫秒,这比任何可以想象的非真空管设备快得多,以至于它实际上产生保持设备平衡的一个严重问题,即保持其输入和输出端之外的必要的人类监督与其操作同步。……
As pointed out in §10.4.3, the reaction time of a not too complicated vacuum tube device can be made as short as one microsecond. Now at this rate even the unmanipulated duration of the multiplication, obtained in §10.5.3 is acceptable: 1000–1500 reaction times amount to 1–1.5 milliseconds, and this is so much faster than any conceivable non vacuum tube device, that it actually produces a serious problem of keeping the device balanced, that is to keep the necessarily human supervision beyond its input and output ends in step with its operations. …
关于其他算术运算,可以这样说:加法和减法显然比乘法快得多。以 27 个二进制数字为基础(参见第10.5.3节)并考虑进位,每个数字最多需要 27 个步骤的两倍,即大约 30-50 个步骤或反应时间。这相当于 0.03–0.05 毫秒。在该方案中,在乘法中没有尝试捷径和伸缩并且使用二进制系统,除法所花费的步数与乘法大约相同。……平方根通常(在这个方案中也是如此)本质上并不比除法长。
Regarding other arithmetical operations this can be said: Addition and subtraction are clearly much faster than multiplication. On a basis of 27 binary digits (cf. §10.5.3) and taking carrying into consideration, each should take at most twice 27 steps, that is about 30–50 steps or reaction times. This amounts to .03–.05 milliseconds. Division takes, in this scheme where shortcuts and telescoping have not been attempted in multiplying and the binary system is being used, about the same number of steps as multiplication. … Square rooting is usually, and in this scheme too, not essentially longer than dividing.
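The durations quoted above follow directly from the step counts and the one-microsecond reaction time. A minimal sketch of that arithmetic, in which the function name `duration_ms` and the constant name are illustrative assumptions rather than anything from the report:

```python
# Reaction time of one element, in microseconds, as assumed in the text.
REACTION_TIME_US = 1


def duration_ms(steps: int, reaction_time_us: int = REACTION_TIME_US) -> float:
    """Convert a number of reaction times (steps) into milliseconds."""
    return steps * reaction_time_us / 1000


# Unaccelerated multiplication: 1000 to 1500 reaction times.
multiplication_ms = (duration_ms(1000), duration_ms(1500))

# Addition or subtraction: about 30 to 50 reaction times.
addition_ms = (duration_ms(30), duration_ms(50))

print(multiplication_ms)
print(addition_ms)
```

On these figures an addition is roughly thirty times faster than a multiplication, which is the gap the passage relies on.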
因此,似乎值得考虑以下观点:设备应该尽可能简单,即包含尽可能少的元素。这可以通过从不同时执行两个操作来实现,如果这会导致所需元素数量的显着增加。结果将是该设备将工作得更可靠,并且真空管可以被驱动到比其他方式更短的反应时间。
Thus it seems worthwhile to consider the following viewpoint: The device should be as simple as possible, that is, contain as few elements as possible. This can be achieved by never performing two operations simultaneously, if this would cause a significant increase in the number of elements required. The result will be that the device will work more reliably and the vacuum tubes can be driven to shorter reaction times than otherwise.
还值得强调的是,到目前为止,所有关于高速数字计算设备的思考都倾向于相反的方向:以倍增所需元件数量为代价,通过伸缩过程来加速。因此,尝试尽可能完整地思考相反的观点似乎更具启发性:绝对避免上述程序,即始终如一地执行第 10.5.6 节中阐述的原则。
It is also worth emphasizing that up to now all thinking about high speed digital computing devices has tended in the opposite direction: Towards acceleration by telescoping processes at the price of multiplying the number of elements required. It would therefore seem to be more instructive to try to think out as completely as possible the opposite viewpoint: That of absolutely refraining from the procedure mentioned above, that is of carrying out consistently the principle formulated in §10.5.6.
因此,我们将朝这个方向前进。……
We will therefore proceed in this direction. …
为了做到这一点,有必要使用一些示意图来了解设备标准元件的功能:实际上,有关设备的算术和逻辑控制程序及其其他功能的决策只能基于对元素功能的一些假设而做出。
In order to do this it is necessary to use some schematic picture for the functioning of the standard element of the device: Indeed, the decisions regarding the arithmetical and the logical control procedures of the device, as well as its other functions, can only be made on the basis of some assumptions about the functioning of the elements.
理想的程序是将这些元件按照其应有的样子处理:作为真空管。然而,这需要在讨论的早期阶段对特定的无线电工程问题进行详细分析,因为仍有太多的替代方案需要进行详尽而详细的处理。此外,从实际性能等角度来看,安排算术程序、逻辑控制等的众多替代可能性将叠加在选择真空管和其他电路元件的类型和尺寸的同样众多的可能性上。这将产生一种复杂和不透明的局面,在这种情况下,我们现在正在尝试的初步方向几乎不可能实现。
The ideal procedure would be to treat the elements as what they are intended to be: as vacuum tubes. However, this would necessitate a detailed analysis of specific radio engineering questions at this early stage of the discussion, when too many alternatives are still open to be treated all exhaustively and in detail. Also, the numerous alternative possibilities for arranging arithmetical procedures, logical control, etc., would superpose on the equally numerous possibilities for the choice of types and sizes of vacuum tubes and other circuit elements from the point of view of practical performance, etc. All this would produce an involved and opaque situation in which the preliminary orientation which we are now attempting would be hardly possible.
为了避免这种情况,我们将基于一个假设的元件进行考虑,该元件的功能本质上类似于真空管,例如具有适当关联 RLC 电路的三极管,但可以将其作为一个孤立的实体进行讨论,而无需详细讨论无线电频率电磁考虑。我们再次强调:这种简化只是暂时的,只是一个暂时的立场,以使目前的初步讨论成为可能。在初步讨论得出结论后,必须重新考虑这些元素的真实电磁性质。但届时初步讨论的决定将会公布,相应的替代方案也会相应地被消除。
In order to avoid this we will base our considerations on a hypothetical element, which functions essentially like a vacuum tube—e.g. like a triode with an appropriate associated RLC-circuit—but which can be discussed as an isolated entity, without going into detailed radio frequency electromagnetic considerations. We re-emphasize: This simplification is only temporary, only a transient standpoint, to make the present preliminary discussion possible. After the conclusions of the preliminary discussion the elements will have to be reconsidered in their true electromagnetic nature. But at that time the decisions of the preliminary discussion will be available, and the corresponding alternatives accordingly eliminated.
我们将讨论的元素,称为 E 元素,将被表示为一个圆 O,它接收兴奋性和抑制性刺激,并沿着与其相连的线发出自己的刺激:该轴可能分支: 。沿着它的发射通过突触延迟跟随原始刺激,我们可以假设这是一个固定时间,对于所有 E 元素都相同,用 t 表示。我们建议忽略除 t 之外的其他延迟(由于沿线刺激的传导)。我们将通过线上的箭头标记延迟 t 的存在: 。这也将有助于识别线的原点和方向。
The element which we will discuss, to be called an E-element, will be represented to be a circle O, which receives the excitatory and inhibitory stimuli, and emits its own stimuli along a line attached to it: This axis may branch: . The emission along it follows the original stimulation by a synaptic delay, which we can assume to be a fixed time, the same for all E-elements, to be denoted by t. We propose to neglect the other delays (due to conduction of the stimuli along the lines) aside of t. We will mark the presence of the delay t by an arrow on the line: . This will also serve to identify the origin and the direction of the line.
人类神经系统和我们的预期应用之间的另一个本质分歧在于我们使用了所有 E 元素共有的明确定义的无分散突触延迟t 。(重点是排除色散。我们实际上将使用具有突触延迟 2 t的 E 元件,……)我们建议使用延迟t作为绝对时间单位,可以依靠它来同步各种功能设备的部件。这种安排的优点是显而易见的,具体的技术原因将出现在[编辑:句子在这里结束]中。
Another point of essential divergence between the human nervous system and our intended application consists in our use of a well-defined dispersionless synaptic delay t, common to all E-elements. (The emphasis is on the exclusion of a dispersion. We will actually use E-elements with a synaptic delay 2t, …) We propose to use the delays t as absolute units of time which can be relied upon to synchronize the functions of various parts of the device. The advantages of such an arrangement are immediately plausible, specific technical reasons will appear in [EDITOR: sentence ends here].
为了实现这一点,有必要将设备设想为第10.4.1节意义上的同步设备。中央时钟最好被认为是一个电振荡器,它在每个周期 t 中发出一个长度为 t′ 的短标准脉冲。E 元件名义上发出的刺激实际上是时钟的脉冲,脉冲充当时钟的门。显然,为了让时钟脉冲无失真地通过,门必须保持打开的时间段有很宽的容差。参见图10.2。因此,门的打开可以由具有平均延迟时间 t 但允许有相当大分散的任何电延迟装置来控制。尽管如此,有效的突触延迟将以时钟的全部精度等于 t,并且在每一步之后刺激都会完全更新和同步。……
In order to achieve this, it is necessary to conceive the device as synchronous in the sense of §10.4.1. The central clock is best thought of as an electrical oscillator, which emits in every period t a short, standard pulse of a length t′ of about . The stimuli emitted nominally by an E-element are actually pulses of the clock, for which the pulse acts as a gate. There is clearly a wide tolerance for the period during which the gate must be kept open, to pass the clock-pulse without distortion. Cf. Figure 10.2. Thus the opening of the gate can be controlled by any electric delay device with a mean delay time t, but considerable permissible dispersion. Nevertheless, the effective synaptic delay will be t with the full precision of the clock, and the stimulus is completely renewed and synchronized after each step. …
图 10.2: 时钟脉冲
Figure 10.2: Clock pulses
存储器的(容量)单位是保留一位二进制数字的值的能力。
The (capacity) unit of memory is the ability to retain the value of one binary digit.
我们现在可以表达这些记忆单元中各类信息的“成本”。
We can now express the “cost” of various types of information in these memory units.
让我们首先考虑存储标准(实)数所需的内存容量。如所示……,我们将把这个数字的大小固定为 30 位二进制数字(至少对于大多数用途……)。这使相对舍入误差保持在 $2^{-30}$ 以下,其对应于 $10^{-9}$,即携带 9 位有效十进制数字。因此,一个标准数对应于 30 个存储单元。必须为其符号添加一个单位……并且建议再添加一个单位,以代替将其表征为数字的符号(以将其与命令区分开,参见第10.14节)。这样我们就得到每个数字 $32 = 2^5$ 个单位。
Let us consider first the memory capacity required to store a standard (real) number. As indicated …, we shall fix the size of such a number at 30 binary digits (at least for most uses …). This keeps the relative rounding-off errors below $2^{-30}$, which corresponds to $10^{-9}$, i.e. to carrying 9 significant decimal digits. Thus a standard number corresponds to 30 memory units. To this must be added one unit for its sign … and it is advisable to add a further unit in lieu of a symbol which characterizes it as a number (to distinguish it from an order, cf. §10.14). In this way we arrive at $32 = 2^5$ units per number.
事实上,一个数字需要 32 个内存单元,因此建议以这种方式细分整个内存:首先,显然,细分为单元,其次细分为 32 个单元的组,称为小循环。……每个标准(实数)数相应地恰好占据一个小周期。如果存储器的所有其他常量也适合这种细分为小周期,则它简化了整个存储器的组织以及设备的各种同步问题。……
The fact that a number requires 32 memory units, makes it advisable to subdivide the entire memory in this way: First, obviously, into units, second into groups of 32 units, to be called minor cycles. … Each standard (real) number accordingly occupies precisely one minor cycle. It simplifies the organization of the entire memory, and various synchronization problems of the device along with it, if all other constants of the memory are also made to fit into this subdivision into minor cycles. …
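The unit count for one standard number can be checked with a few lines of exact arithmetic. This is only a sketch of the bookkeeping described above; the constant and function names are illustrative assumptions, not terms from the report.

```python
from fractions import Fraction

MAGNITUDE_DIGITS = 30  # binary digits of the number itself
SIGN_UNITS = 1         # one unit for the sign
TAG_UNITS = 1          # one unit marking "number" as opposed to "order"

UNITS_PER_MINOR_CYCLE = MAGNITUDE_DIGITS + SIGN_UNITS + TAG_UNITS

# Relative rounding-off bound for 30 binary digits, compared exactly
# against nine significant decimal digits.
rounding_bound = Fraction(1, 2 ** MAGNITUDE_DIGITS)
nine_decimal_digits = Fraction(1, 10 ** 9)


def units_for(minor_cycles: int) -> int:
    """Memory units occupied by a given number of minor cycles."""
    return minor_cycles * UNITS_PER_MINOR_CYCLE


print(UNITS_PER_MINOR_CYCLE)                 # 32
print(UNITS_PER_MINOR_CYCLE == 2 ** 5)       # True
print(rounding_bound < nine_decimal_digits)  # True
```

Exact fractions are used so that the comparison between the binary and decimal bounds does not depend on floating-point rounding.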
在我们制定这个代码之前,我们必须对 CC 的功能及其与 M 的关系进行一些一般性的考虑。
Before we can formulate this code, we must go through some general considerations concerning the functions of CC and its relation to M.
CC 收到的订单来自M,即来自存储数字材料的同一位置。(参见§ 10.2.4 … .) M 的内容由小循环…组成,因此根据上述,每个小循环必须包含一个区别标记,表明它是标准数还是顺序。
The orders which are received by CC come from M, i.e. from the same place where the numerical material is stored. (Cf. §10.2.4 ….) The content of M consists of minor cycles …, hence by the above each minor cycle must contain a distinguishing mark which indicates whether it is a standard number or an order.
CC 收到的命令自然分为以下四类: (a) CC 指示 CA 执行其十项具体操作之一的命令……;(b) 命令CC将标准号码从一地转移到另一地;(c)命令CC将其与M的连接转移到M中的不同点,以便从那里获得下一个命令;(d) 控制设备输入和输出操作的命令(即第10.2.7节中的 I 和第10.2.8节中的 O )。
The orders which CC receives fall naturally into these four classes: (a) orders for CC to instruct CA to carry out one of its ten specific operations …; (b) orders for CC to cause the transfer of a standard number from one place to another; (c) orders for CC to transfer its own connection with M to a different point in M, with the purpose of getting its next order from there; (d) orders controlling the operation of the input and the output of the device (i.e. I of §10.2.7 and O of §10.2.8).
现在让我们分别考虑这些类别 (a)–(d)。我们目前无法在……有关 (a) 的陈述中添加任何内容……。(d) 的讨论也最好推迟一下……。然而,我们建议现在讨论(b)和(c)。
Let us now consider these classes (a)–(d) separately. We cannot at this time add anything to the statements … concerning (a) …. The discussion of (d) is also better delayed …. We propose, however, to discuss (b) and (c) now.
除此之外,这样的转移命令可以规定:在所需的小循环中接收并执行命令后,CC 应将其连接返回到这样一个 DLA 器件,即其中包含紧跟在含有该转移命令的小循环之后的那个小循环,等待该小循环出现在输出端,然后按自然时间顺序继续接受从那里开始的命令。或者,在所需的小循环中接收并执行命令后,CC 应保持该连接,并按自然时间顺序接受从那里开始的命令。为方便起见,将第一种类型的转移称为瞬时转移,将第二种类型的转移称为永久转移。
Apart from this, such a transfer order might provide that, after receiving and carrying out the order in the desired minor cycle, CC should return its connection to the DLA organ which contains the minor cycle that follows upon the one containing the transfer order, wait until this minor cycle appears at the output, and then continue to accept orders from there on in the natural temporal sequence. Alternatively, after receiving and carrying out the order in the desired minor cycle, CC should continue with that connection, and accept orders from there on in the natural temporal sequence. It is convenient to call a transfer of the first type a transient one, and one of the second type a permanent one.
显然,永久转移是经常需要的,因此第二种类型当然是必要的。毫无疑问,在传输标准号码时需要瞬时传输(命令 (c′) 和 (c′′),参见 § 10.14.2 …的末尾)。真正的订单中是否需要它们似乎非常值得怀疑,特别是因为此类订单仅构成 M …内容的一小部分,并且临时转移订单总是可以由两个永久转移订单来表示。因此,我们将使所有转移永久化,但与转移标准号码相关的转移除外,如上所述。……
It is clear that permanent transfers are frequently needed, hence the second type is certainly necessary. Transient transfers are undoubtedly required in connection with transferring standard numbers (orders (c′) and (c′′), cf. the end of §10.14.2 …). It seems very doubtful whether they are ever needed in true orders, particularly since such orders constitute only a small part of the contents of M …, and a transient transfer order can always be expressed by two permanent transfer orders. We will therefore make all transfers permanent, except those connected with transferring standard numbers, as indicated above. …
因此,让我们重申相关的定义和析取。
Let us therefore restate the pertinent definitions and disjunctions.
M 的内容是记忆单元,每个记忆单元都以刺激的存在或不存在为特征。它可以用来相应地表示二进制数字 1 或 0,并且我们无论如何将通过它以这种方式对应的二进制数字 $i = 1$ 或 $0$ 来指定其内容。… 这些单元组合在一起形成 32 个单元的小循环,这些小循环是在我们将介绍的代码中具有直接意义的实体。… 我们用 $i_0, i_1, i_2, \ldots, i_{31}$ 表示组成小循环 32 个单元的二进制数字,按其自然时间顺序。由这些单元组成的小循环可以写成 $I = (i_0, i_1, i_2, \ldots, i_{31}) = (i_v)$。
The contents of M are the memory units, each one being characterized by the presence or absence of a stimulus. It can be used to represent accordingly the binary digit 1 or 0, and we will at any rate designate its content by the binary digit $i = 1$ or $0$ to which it corresponds in this manner. … These units are grouped together to form 32-unit minor cycles, and these minor cycles are the entities which will acquire direct significance in the code which we will introduce. … We denote the binary digits which make up the 32 units of a minor cycle, in their natural temporal sequence, by $i_0, i_1, i_2, \ldots, i_{31}$. The minor cycles with these units may be written $I = (i_0, i_1, i_2, \ldots, i_{31}) = (i_v)$.
小循环分为两类:标准数和命令。(参见… §10.14.1。)这两个类别应该通过它们各自的第一个单元…即 $i_0$ 的值来相互区分。据此我们约定 $i_0 = 0$ 表示一个标准数,$i_0 = 1$ 表示一个命令。
Minor cycles fall into two classes: Standard numbers and orders. (Cf. … §10.14.1.) These two categories should be distinguished from each other by their respective first units … i.e. by the value of $i_0$. We agree accordingly that $i_0 = 0$ is to designate a standard number, and $i_0 = 1$ an order.
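The convention just stated, that the first unit of a minor cycle tells a standard number from an order, can be sketched in a few lines. Representing a minor cycle as a Python list of 32 bits in temporal order, and the function name `classify`, are assumptions made for illustration only.

```python
from typing import List

UNITS_PER_MINOR_CYCLE = 32


def classify(minor_cycle: List[int]) -> str:
    """Return 'standard number' or 'order' from the first unit of a minor cycle."""
    if len(minor_cycle) != UNITS_PER_MINOR_CYCLE:
        raise ValueError("a minor cycle has exactly 32 units")
    if any(bit not in (0, 1) for bit in minor_cycle):
        raise ValueError("each unit holds one binary digit")
    # First unit: 0 designates a standard number, 1 an order.
    return "order" if minor_cycle[0] == 1 else "standard number"


number_cycle = [0] + [1] * 31
order_cycle = [1] + [0] * 31

print(classify(number_cycle))
print(classify(order_cycle))
```

Only the first unit is inspected; how the remaining 31 units are laid out for numbers and for orders is not assumed here.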
图 10.3: EDVAC“命令”初稿(说明)
Figure 10.3: First draft EDVAC “orders” (instructions)
经电气和电子工程师协会许可,转载自冯·诺依曼 (1993)。
Reprinted from von Neumann (1993), with permission from the Institute for Electrical and Electronics Engineers.
在麻省理工学院教授工程学期间,Vannevar Bush(1890-1974)建造了一台称为微分分析仪的模拟计算机,用于求解微分方程。以现代标准来看,这是一个笨拙的设备,大部分是机械设备,只有一些电气元件,但它确实有效。1936 年,布什聘请克劳德·香农 (Claude Shannon) 作为研究助理来运行该设备,这次经历让香农系统地思考了开关电路,并促成了他著名的硕士论文(第 8 章)。布什转向考虑信息存储和检索。受最近扩大的缩微胶卷商业用途(《纽约时报》于 1935 年开始以缩微形式出版)的影响,布什把他在 1945 年这篇预言性文章中描述的设备设想为 memex。缩微胶卷也同样启发了科幻作家 H. G. Wells (1938)。他写道:“现在,为所有人类知识、思想和成就创建一个有效的索引,即为全人类创建一个完整的地球记忆,已经不存在任何实际障碍了。……整个人类记忆可以而且很可能在很短的时间内为每个人所用。”
While teaching engineering at MIT, Vannevar Bush (1890–1974) built an analog computer called a differential analyzer for solving differential equations. It was a clumsy device by modern standards, mostly mechanical with a few electrical components, but it worked. Bush hired Claude Shannon as a research assistant in 1936 to run the device, an experience that got Shannon thinking systematically about switching circuits and led to his famous Master’s thesis (chapter 8). Bush turned to thinking about information storage and retrieval. Influenced by the recently expanded commercial use of microfilm (the New York Times began publishing in microform in 1935), Bush imagined the device he describes in this prophetic 1945 article as a memex. Microfilm had similarly inspired science fiction writer H. G. Wells (1938). “There is no practical obstacle whatever now,” he wrote, “to the creation of an efficient index to all human knowledge, ideas and achievements, to the creation, that is, of a complete planetary memory for all mankind. …The whole human memory can be, and probably in a short time will be, made accessible to every individual.”
相比之下,布什是以工程师的身份写作的,而不仅仅是普通的工程师。第二次世界大战期间,他曾担任科学研究与发展办公室主任,协调美国科学界援助战争的努力。例如,他非常了解哈佛大学的艾肯 Mark I,以及作为曼哈顿计划的一部分来设计原子武器所进行的困难计算。这篇文章发表的时间与布什向杜鲁门总统提交《科学,无尽的前沿》(布什,1945c)有关,这是一份关于科学未来的报告,该报告导致了国家科学基金会的成立。布什在《大西洋月刊》上发表了这篇文章,不久之后又在《生活》 (Bush,1945b)中以缩写和戏剧性插图的形式发表了这篇文章,作为他努力增加公众对科学研究的支持的一部分。
Bush, by contrast, was writing as an engineer, and not just any engineer. He had served as Director of the Office of Scientific Research and Development during World War II, marshaling the American scientific community’s efforts to aid the war effort. He was well aware, for example, of Aiken’s Mark I at Harvard and of the difficult computations done as part of the Manhattan Project to design an atomic weapon. This article appeared about the same time as Bush submitted to President Truman “Science, the Endless Frontier” (Bush, 1945c), a report on the future of science that led to the creation of the National Science Foundation. Bush published this piece in the Atlantic—and soon after in abbreviated and dramatically illustrated form in Life (Bush, 1945b)—as part of his effort to increase public support for scientific research.
布什在这里询问科学可以为人类的和平未来做出什么贡献,并提出了即时、无处不在、关联信息检索的愿景。当时可用的存储介质无法支持这样的成就——这篇文章仅因其对可用计算和存储技术的回顾以及对其改进的预测而令人着迷——但从那时起,他的功能愿景启发了许多电子系统。
Bush here asks what science can contribute to the peaceful future of humankind, and offers in response a vision of instantaneous, ubiquitous, associative information retrieval. The storage media available at the time could not support such an achievement—the article is fascinating just for its review of the available computing and storage technologies and the projections for their improvement—but his functional vision has inspired many an electronic system since then.
这不是一场科学家的战争;这是一场所有人都参与的战争。科学家们将过去的职业竞争埋葬在共同事业的需求中,分享了很多东西,学到了很多东西。能够以有效的伙伴关系开展工作是令人兴奋的。现在,对于许多人来说,这似乎即将结束。科学家接下来要做什么?对于生物学家,特别是对于医学科学家来说,几乎没有优柔寡断的余地,因为他们的战争几乎不需要他们离开旧的道路。许多人确实能够在他们熟悉的和平时期实验室中进行战争研究。他们的目标基本相同。
THIS has not been a scientist’s war; it has been a war in which all have had a part. The scientists, burying their old professional competition in the demand of a common cause, have shared greatly and learned much. It has been exhilarating to work in effective partnership. Now, for many, this appears to be approaching an end. What are the scientists to do next? For the biologists, and particularly for the medical scientists, there can be little indecision, for their war has hardly required them to leave the old paths. Many indeed have been able to carry on their war research in their familiar peacetime laboratories. Their objectives remain much the same.
物理学家们的步伐受到了最猛烈的冲击,他们放弃了学术追求,转而制造奇怪的破坏性装置,他们不得不为他们意想不到的任务设计新的方法。他们在能够击退敌人的设备上尽了自己的一份力量,并与我们盟友的物理学家共同努力。他们内心感受到了成就的激动。他们是一支伟大团队的一部分。现在,随着和平的临近,人们会问,他们在哪里才能找到最值得实现的目标。
It is the physicists who have been thrown most violently off stride, who have left academic pursuits for the making of strange destructive gadgets, who have had to devise new methods for their unanticipated assignments. They have done their part on the devices that made it possible to turn back the enemy, have worked in combined effort with the physicists of our allies. They have felt within themselves the stir of achievement. They have been part of a great team. Now, as peace approaches, one asks where they will find objectives worthy of their best.
人类对科学和他的研究带来的新仪器的使用有什么持久的好处?首先,他们增强了他对物质环境的控制。他们改善了他的食物、衣服和住所;他们增强了他的安全感,使他部分地摆脱了赤裸裸的存在的束缚。他们让他对自己的生物过程有了更多的了解,从而逐渐摆脱疾病的困扰,并延长了寿命。它们阐明了他的生理和心理功能的相互作用,有望改善他的心理健康。科学提供了人与人之间最快捷的沟通;它提供了思想记录,并使人类能够操纵该记录并从中进行摘录,以便知识在整个种族而不是个人的一生中不断发展和延续。研究成果不断涌现。但越来越多的证据表明,随着专业化的扩展,我们今天正陷入困境。调查员对数千名其他工作人员的发现和结论感到震惊——他没有时间去理解这些结论,更不用说记住这些结论了。然而,专业化对于进步变得越来越必要,而在学科之间建立桥梁的努力相应地是肤浅的。
Of what lasting benefit has been man’s use of science and of the new instruments which his research brought into existence? First, they have increased his control of his material environment. They have improved his food, his clothing, his shelter; they have increased his security and released him partly from the bondage of bare existence. They have given him increased knowledge of his own biological processes so that he has had a progressive freedom from disease and an increased span of life. They are illuminating the interactions of his physiological and psychological functions, giving the promise of an improved mental health. Science has provided the swiftest communication between individuals; it has provided a record of ideas and has enabled man to manipulate and to make extracts from that record so that knowledge evolves and endures throughout the life of a race rather than that of an individual. There is a growing mountain of research. But there is increased evidence that we are being bogged down today as specialization extends. The investigator is staggered by the findings and conclusions of thousands of other workers—conclusions which he cannot find time to grasp, much less to remember, as they appear. Yet specialization becomes increasingly necessary for progress, and the effort to bridge between disciplines is correspondingly superficial.
从专业角度来说,我们传播和审查研究结果的方法已经有几代人的历史了,如今已完全不足以达到其目的。如果可以评估撰写学术著作和阅读学术著作所花费的总时间,那么这些时间之间的比例很可能会令人震惊。那些认真地试图通过仔细和持续的阅读来跟上当前思想的人,即使是在有限的领域,很可能会回避一项旨在显示上个月的努力有多少可以随叫随到的考试。孟德尔的遗传学定律概念在一代人的时间里被世人遗忘了,因为他的出版物没有到达少数有能力掌握和扩展它的人手中;而这类灾难无疑正在我们周围重演,因为真正重要的成就被淹没在大量无关紧要的事情中。
Professionally our methods of transmitting and reviewing the results of research are generations old and by now are totally inadequate for their purpose. If the aggregate time spent in writing scholarly works and in reading them could be evaluated, the ratio between these amounts of time might well be startling. Those who conscientiously attempt to keep abreast of current thought, even in restricted fields, by close and continuous reading might well shy away from an examination calculated to show how much of the previous month’s efforts could be produced on call. Mendel’s concept of the laws of genetics was lost to the world for a generation because his publication did not reach the few who were capable of grasping and extending it; and this sort of catastrophe is undoubtedly being repeated all about us, as truly significant attainments become lost in the mass of the inconsequential.
困难似乎并不在于我们鉴于当今兴趣的范围和多样性而过度出版,而是出版已经远远超出了我们目前真正利用记录的能力。人类经验的总结正在以惊人的速度扩展,我们用来穿过随后的迷宫到达暂时重要的物品的手段与方帆船舶时代所使用的手段相同。但随着新的、强大的工具的投入使用,出现了变化的迹象。能够从物理意义上看到事物的光电管,能够记录所见甚至未见事物的先进摄影术,能够在比蚊子振动翅膀所用的功率更少的功率引导下控制强大力量的热电子管,阴极射线管渲染可见,事件如此短暂,相比之下,一微秒就是很长的时间,继电器组合将比任何人类操作员更可靠地执行所涉及的运动序列,并且速度快数千倍——有大量的机械辅助工具可以用来实现转换在科学记录中。
The difficulty seems to be, not so much that we publish unduly in view of the extent and variety of present day interests, but rather that publication has been extended far beyond our present ability to make real use of the record. The summation of human experience is being expanded at a prodigious rate, and the means we use for threading through the consequent maze to the momentarily important item is the same as was used in the days of square-rigged ships. But there are signs of a change as new and powerful instrumentalities come into use. Photocells capable of seeing things in a physical sense, advanced photography which can record what is seen or even what is not, thermionic tubes capable of controlling potent forces under the guidance of less power than a mosquito uses to vibrate his wings, cathode ray tubes rendering visible an occurrence so brief that by comparison a microsecond is a long time, relay combinations which will carry out involved sequences of movements more reliably than any human operator and thousands of times as fast—there are plenty of mechanical aids with which to effect a transformation in scientific records.
两个世纪前,莱布尼茨发明了一台计算机,它体现了现代键盘设备的大部分基本特征,但当时它无法投入使用。这种情况的经济学是不利的:在大规模生产之前,建造它所涉及的劳动力超过了它的使用所节省的劳动力,因为它所能完成的一切都可以通过充分使用铅笔和纸来复制。而且,它会经常发生故障,因此不能被依赖;因为在当时和之后很长一段时间里,复杂性和不可靠性是同义词。
Two centuries ago Leibniz invented a calculating machine which embodied most of the essential features of recent keyboard devices, but it could not then come into use. The economics of the situation were against it: the labor involved in constructing it, before the days of mass production, exceeded the labor to be saved by its use, since all it could accomplish could be duplicated by sufficient use of pencil and paper. Moreover, it would have been subject to frequent breakdown, so that it could not have been depended upon; for at that time and long after, complexity and unreliability were synonymous.
巴贝奇即使在他的时代得到了非常慷慨的支持,也无法生产出他伟大的算术机器。他的想法很合理,但建造和维护成本太高了。如果法老得到了详细而明确的汽车设计,并且他完全理解了这些设计,那么为一辆汽车制造数千个零件就会耗费他王国的资源,而那辆汽车会在第一次去吉萨的路上抛锚。
Babbage, even with remarkably generous support for his time, could not produce his great arithmetical machine. His idea was sound enough, but construction and maintenance costs were then too heavy. Had a Pharaoh been given detailed and explicit designs of an automobile, and had he understood them completely, it would have taxed the resources of his kingdom to have fashioned the thousands of parts for a single car, and that car would have broken down on the first trip to Giza.
现在可以非常经济地建造具有可互换零件的机器。尽管很复杂,但它们的性能可靠。看看不起眼的打字机、电影摄影机或汽车。当彻底理解后,电触点就不再粘连了。请注意自动电话交换机,它有数十万个这样的联系人,但仍然可靠。密封在薄玻璃容器中的金属蜘蛛网,加热到发出耀眼光芒的电线,简而言之,收音机的热电子管,是由亿万个制成的,在包装中扔来扔去,插入插座中 - 并且它可以工作!它的游丝部件,以及其建造过程中涉及的精确位置和对齐方式,需要行会的工匠大师花费数月的时间;现在它的建造成本为三十美分。世界已经进入了一个廉价复杂设备具有极高可靠性的时代。必然会有一些结果。
Machines with interchangeable parts can now be constructed with great economy of effort. In spite of much complexity, they perform reliably. Witness the humble typewriter, or the movie camera, or the automobile. Electrical contacts have ceased to stick when thoroughly understood. Note the automatic telephone exchange, which has hundreds of thousands of such contacts, and yet is reliable. A spider web of metal, sealed in a thin glass container, a wire heated to brilliant glow, in short, the thermionic tube of radio sets, is made by the hundred million, tossed about in packages, plugged into sockets—and it works! Its gossamer parts, the precise location and alignment involved in its construction, would have occupied a master craftsman of the guild for months; now it is built for thirty cents. The world has arrived at an age of cheap complex devices of great reliability; and something is bound to come of it.
记录如果要对科学有用,就必须不断扩展,必须存储,最重要的是必须可供查阅。今天,我们传统上通过书写和摄影来记录,然后是印刷;但我们也在胶片、蜡盘和磁线上进行记录。即使没有出现全新的记录程序,现有的这些程序也肯定处于修改和扩展的过程中。
A record if it is to be useful to science, must be continuously extended, it must be stored, and above all it must be consulted. Today we make the record conventionally by writing and photography, followed by printing; but we also record on film, on wax disks, and on magnetic wires. Even if utterly new recording procedures do not appear, these present ones are certainly in the process of modification and extension.
当然,摄影的进步不会停止。
Certainly progress in photography is not going to stop.
更快的材料和镜头、更自动的相机、更细粒度的敏感化合物以允许扩展微型相机的想法,这些都迫在眉睫。让我们将这一趋势预测为一个合乎逻辑的(即使不是不可避免的)结果。未来的相机猎手额头上有一个比核桃稍大的肿块。它拍摄3毫米见方的照片,然后进行投影或放大,毕竟只涉及到目前实践的10倍。该镜头具有通用焦距,可达到肉眼所能容纳的任何距离,仅仅是因为它的焦距较短。胡桃木上有一个内置光电管,就像我们现在至少在一台相机上一样,它可以自动调整曝光以适应各种照明。胡桃木内装有可进行一百次曝光的胶片,插入胶片夹后,操作快门和移动胶片的弹簧就一次上紧。它以全彩形式产生结果。它很可能是立体的,并用两个间隔开的玻璃眼睛进行记录,因为立体技术的显着改进指日可待。
Faster material and lenses, more automatic cameras, finer-grained sensitive compounds to allow an extension of the minicamera idea, are all imminent. Let us project this trend ahead to a logical, if not inevitable, outcome. The camera hound of the future wears on his forehead a lump a little larger than a walnut. It takes pictures 3 millimeters square, later to be projected or enlarged, which after all involves only a factor of 10 beyond present practice. The lens is of universal focus, down to any distance accommodated by the unaided eye, simply because it is of short focal length. There is a built-in photocell on the walnut such as we now have on at least one camera, which automatically adjusts exposure for a wide range of illumination. There is film in the walnut for a hundred exposures, and the spring for operating its shutter and shifting its film is wound once for all when the film clip is inserted. It produces its result in full color. It may well be stereoscopic, and record with two spaced glass eyes, for striking improvements in stereoscopic technique are just around the corner.
触发百叶窗的绳子可能会伸到人的袖子里,他的手指就能轻松够到。快速挤压,照片就拍好了。一副普通眼镜的镜片顶部附近有一条方形细纹,该细纹不妨碍普通视力。当一个物体出现在那个方块中时,它就会按照它的图片排列起来。当未来的科学家在实验室或现场走动时,每当他看到一些值得记录的东西时,他都会按下快门,然后将其放入,甚至没有听到咔嗒声。这一切都太棒了吗?它唯一奇妙的地方是可以通过使用它来制作尽可能多的图片。
The cord which trips its shutter may reach down a man’s sleeve within easy reach of his fingers. A quick squeeze, and the picture is taken. On a pair of ordinary glasses is a square of fine lines near the top of one lens, where it is out of the way of ordinary vision. When an object appears in that square, it is lined up for its picture. As the scientist of the future moves about the laboratory or the field, every time he looks at something worthy of the record, he trips the shutter and in it goes, without even an audible click. Is this all fantastic? The only fantastic thing about it is the idea of making as many pictures as would result from its use.
会有干摄影吗?它已经以两种形式出现了。当布雷迪拍摄内战照片时,曝光时底版必须是湿的。现在在开发过程中它必须是湿的。将来也许根本不需要弄湿它。长期以来,存在着用重氮染料浸渍的胶片,其无需显影即可形成图像,因此只要相机被操作,图像就已经存在。暴露在氨气中会破坏未曝光的染料,然后可以将照片取出到光线下进行检查。这个过程现在很慢,但有人可能会加快它的速度,而且它不存在像现在让摄影研究人员忙碌的颗粒困难。通常,能够拍摄相机并立即查看照片是有利的。
Will there be dry photography? It is already here in two forms. When Brady made his Civil War pictures, the plate had to be wet at the time of exposure. Now it has to be wet during development instead. In the future perhaps it need not be wetted at all. There have long been films impregnated with diazo dyes which form a picture without development, so that it is already there as soon as the camera has been operated. An exposure to ammonia gas destroys the unexposed dye, and the picture can then be taken out into the light and examined. The process is now slow, but someone may speed it up, and it has no grain difficulties such as now keep photographic researchers busy. Often it would be advantageous to be able to snap the camera and to look at the picture immediately.
现在使用的另一个过程也很慢,而且或多或少有些笨拙。五十年来一直使用浸渍纸,由于纸中包含的碘化合物产生化学变化,在电接触接触到它们的每个点处都会变黑。它们被用来做记录,因为指针在它们上面移动会留下痕迹。如果指针上的电位随着其移动而变化,则线条根据电位变亮或变暗。
Another process now in use is also slow, and more or less clumsy. For fifty years impregnated papers have been used which turn dark at every point where an electrical contact touches them, by reason of the chemical change thus produced in an iodine compound included in the paper. They have been used to make records, for a pointer moving across them can leave a trail behind. If the electrical potential on the pointer is varied as it moves, the line becomes light or dark in accordance with the potential.
该方案现在用于传真传输。指针在纸上一条一条地画出一组间隔很近的线。当它移动时,它的电势会根据从远处站通过电线接收到的变化电流而变化,这些变化是由类似地扫描图片的光电管产生的。在每一瞬间,所画线的暗度都等于光电管观察到的图片上点的暗度。因此,当整个图片被覆盖时,接收端就会出现一个副本。……
This scheme is now used in facsimile transmission. The pointer draws a set of closely spaced lines across the paper one after another. As it moves, its potential is varied in accordance with a varying current received over wires from a distant station, where these variations are produced by a photocell which is similarly scanning a picture. At every instant the darkness of the line being drawn is made equal to the darkness of the point on the picture being observed by the photocell. Thus, when the whole picture has been covered, a replica appears at the receiving end. …
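The facsimile scheme described above amounts to flattening a picture into a stream of darkness values, line by line, and rebuilding it in the same order at the receiving end. A toy sketch, assuming a picture is a list of rows of darkness values between 0 and 1; the function names `scan` and `draw` are illustrative and not drawn from the article:

```python
from typing import Iterable, Iterator, List

Picture = List[List[float]]


def scan(picture: Picture) -> Iterator[float]:
    """Sending station: the photocell reads darkness values line by line."""
    for line in picture:
        for darkness in line:
            yield darkness


def draw(signal: Iterable[float], line_width: int) -> Picture:
    """Receiving station: the pointer redraws the lines from the signal."""
    values = list(signal)
    return [values[i:i + line_width] for i in range(0, len(values), line_width)]


original = [
    [0.0, 0.5, 1.0],
    [1.0, 0.5, 0.0],
]
replica = draw(scan(original), line_width=3)
print(replica == original)  # True
```

The sketch ignores noise and timing on the wire; it shows only that agreeing on the scanning order and line width is enough to reproduce the picture.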
与干摄影一样,显微摄影还有很长的路要走。…假设线性比率为 100 以供将来使用。考虑与纸张厚度相同的薄膜,尽管更薄的薄膜当然也可以使用。即使在这些条件下,大部分普通书籍记录与其缩微胶卷复制品之间的总因数仍为 10,000。大英百科全书可以缩小到火柴盒的大小。一个拥有一百万册图书的图书馆可以压缩到一张桌子的一端。如果自从活字印刷术发明以来,人类已经以杂志、报纸、书籍、小册子、广告简介、信件的形式制作了一份总记录,其卷数相当于十亿本书,那么整个事件,经过组装和压缩,可以用移动货车拖走。当然,仅仅压缩是不够的。不仅要制作、保存记录,还要能够查阅,这方面的事情以后再说。即使是现代的大型图书馆也不被普遍查阅;它被一些人啃食。
Like dry photography, microphotography still has a long way to go. …Assume a linear ratio of 100 for future use. Consider film of the same thickness as paper, although thinner film will certainly be usable. Even under these conditions there would be a total factor of 10,000 between the bulk of the ordinary record on books, and its microfilm replica. The Encyclopedia Britannica could be reduced to the volume of a matchbox. A library of a million volumes could be compressed into one end of a desk. If the human race has produced since the invention of movable type a total record, in the form of magazines, newspapers, books, tracts, advertising blurbs, correspondence, having a volume corresponding to a billion books, the whole affair, assembled and compressed, could be lugged off in a moving van. Mere compression, of course, is not enough; one needs not only to make and store a record but also be able to consult it, and this aspect of the matter comes later. Even the modern great library is not generally consulted; it is nibbled at by a few.
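The compression figures in this paragraph come from squaring the linear reduction ratio, since film is taken to be as thick as paper and only the area shrinks. A short sketch of that arithmetic, with illustrative names that are assumptions rather than anything from the article:

```python
LINEAR_RATIO = 100  # linear reduction assumed in the text

# Film as thick as paper: bulk shrinks by the area factor only.
BULK_FACTOR = LINEAR_RATIO ** 2


def compressed_bulk(volumes: int) -> float:
    """Bulk of a microfilmed collection, measured in ordinary volumes."""
    return volumes / BULK_FACTOR


library = compressed_bulk(1_000_000)            # a library of a million volumes
total_record = compressed_bulk(1_000_000_000)   # a record of a billion books

print(BULK_FACTOR)
print(library)
print(total_record)
```

On these assumptions a million-volume library occupies the space of about a hundred books, and the billion-book record that of about a hundred thousand, which is consistent with the desk and moving-van images in the text.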
然而,就成本而言,压缩很重要。《大英百科全书》的微缩胶卷的材料成本为五分钱,而且只要一美分就可以邮寄到任何地方。印刷一百万份需要花费多少钱?打印一张大版报纸的成本仅为一美分的一小部分。《大英百科全书》的全部材料以缩微胶卷的形式呈现在一张八又二分之一乘十一英寸的纸上。一旦可用,通过未来的照相复制方法,大批量复制品的价格可能会超出材料成本,每件一美分。准备原件吗?这就介绍了该主题的下一个方面。
Compression is important, however, when it comes to costs. The material for the microfilm Britannica would cost a nickel, and it could be mailed anywhere for a cent. What would it cost to print a million copies? To print a sheet of newspaper, in a large edition, costs a small fraction of a cent. The entire material of the Britannica in reduced microfilm form would go on a sheet eight and one-half by eleven inches. Once it is available, with the photographic reproduction methods of the future, duplicates in large quantities could probably be turned out for a cent apiece beyond the cost of materials. The preparation of the original copy? That introduces the next aspect of the subject.
为了进行记录,我们现在按下铅笔或敲击打字机。然后是消化和修正的过程,然后是复杂的排版、印刷和分发过程。考虑到该程序的第一阶段,未来的作者是否会停止用手或打字机书写并直接与记录对话?他通过与速记员或蜡筒交谈来间接做到这一点;但如果他希望他的演讲直接产生打印记录,那么这些要素都存在。他需要做的就是利用现有的机制并改变他的语言。
To make the record, we now push a pencil or tap a typewriter. Then comes the process of digestion and correction, followed by an intricate process of typesetting, printing, and distribution. To consider the first stage of the procedure, will the author of the future cease writing by hand or typewriter and talk directly to the record? He does so indirectly, by talking to a stenographer or a wax cylinder; but the elements are all present if he wishes to have his talk directly produce a typed record. All he needs to do is to take advantage of existing mechanisms and to alter his language.
在最近的一次世界博览会上,展示了一种名为 Voder 的机器。一个女孩敲击它的按键,它就发出了可辨认的语音。整个过程中没有任何人类声带参与;这些按键只是将一些电产生的振动组合起来,并将其传送到扬声器。在贝尔实验室有一种与此相反的机器,称为声码器(Vocoder)。扬声器被麦克风取代,麦克风拾取声音。对它说话,相应的键就会移动。这可能是所设想系统的要素之一。
At a recent World Fair a machine called a Voder was shown. A girl stroked its keys and it emitted recognizable speech. No human vocal cords entered into the procedure at any point; the keys simply combined some electrically produced vibrations and passed these on to a loudspeaker. In the Bell Laboratories there is the converse of this machine, called a Vocoder. The loudspeaker is replaced by a microphone, which picks up sound. Speak to it, and the corresponding keys move. This may be one element of the postulated system.
另一个要素是速记,这是公共会议上经常遇到的有点令人不安的装置。一个女孩懒洋洋地敲击着琴键,环视房间,有时用令人不安的目光看着扬声器。从中出现一条打印条,以语音简化的语言记录了说话者应该说的话。后来,这条条带被重新打印成普通语言,因为在它的新生形式中,只有入门者才能理解。将这两个元素结合起来,让声码器运行速记,结果就是一台在与人交谈时会打字的机器。
The other element is found in the stenotype, that somewhat disconcerting device encountered usually at public meetings. A girl strokes its keys languidly and looks about the room and sometimes at the speaker with a disquieting gaze. From it emerges a typed strip which records in a phonetically simplified language a record of what the speaker is supposed to have said. Later this strip is retyped into ordinary language, for in its nascent form it is intelligible only to the initiated. Combine these two elements, let the Vocoder run the stenotype, and the result is a machine which types when talked to.
确实,我们现在的语言并不是特别适应这种机械化。奇怪的是,通用语言的发明者并没有抓住创造一种更适合传输和记录语音技术的语言的想法。机械化可能会迫使这个问题发生,特别是在科学领域;因此,科学术语对于外行来说将变得更加难以理解。
Our present languages are not especially adapted to this sort of mechanization, it is true. It is strange that the inventors of universal languages have not seized upon the idea of producing one which better fitted the technique for transmitting and recording speech. Mechanization may yet force the issue, especially in the scientific field; whereupon scientific jargon would become still less intelligible to the layman.
人们现在可以想象一位未来的调查员在他的实验室里。他的双手是自由的,他没有被固定。当他四处走动和观察时,他会拍照和评论。时间会自动记录,将两个记录联系在一起。如果他进入现场,他可以通过无线电连接到他的录音机。晚上,当他思考笔记时,他再次将自己的评论记录下来。他的打字记录和他的照片都可能是微型的,以便他将它们投影以供检查。
One can now picture a future investigator in his laboratory. His hands are free, and he is not anchored. As he moves about and observes, he photographs and comments. Time is automatically recorded to tie the two records together. If he goes into the field, he may be connected by radio to his recorder. As he ponders over his notes in the evening, he again talks his comments into the record. His typed record, as well as his photographs, may both be in miniature, so that he projects them for examination.
然而,在数据和观察的收集、从现有记录中提取平行材料以及最终将新材料插入到共同记录的总体中之间需要发生很多事情。对于成熟的思想来说,没有机械的替代品。但创造性思维和本质上的重复性思维是非常不同的事情。对于后者,存在并且可能存在强大的机械辅助装置。
Much needs to occur, however, between the collection of data and observations, the extraction of parallel material from the existing record, and the final insertion of new material into the general body of the common record. For mature thought there is no mechanical substitute. But creative thought and essentially repetitive thought are very different things. For the latter there are, and may be, powerful mechanical aids.
添加一列数字是一个重复的思维过程,很久以前它就被正确地归咎于机器。诚然,机器有时是由键盘控制的,在读取数字并按下相应的键时会产生某种想法,但即使这样也是可以避免的。机器已经被制造出来,可以通过光电管读取键入的数字,然后按下相应的键;这些是用于扫描类型的光电管、用于对随后的变化进行分类的电路以及用于将结果解释为螺线管的动作以拉下按键的继电器电路的组合。
Adding a column of figures is a repetitive thought process, and it was long ago properly relegated to the machine. True, the machine is sometimes controlled by a keyboard, and thought of a sort enters in reading the figures and poking the corresponding keys, but even this is avoidable. Machines have been made which will read typed figures by photocells and then depress the corresponding keys; these are combinations of photocells for scanning the type, electric circuits for sorting the consequent variations, and relay circuits for interpreting the result into the action of solenoids to pull the keys down.
所有这些复杂的事情都是需要的,因为我们学会了书写数字的笨拙方式。如果我们简单地通过卡片上一组点的配置来记录它们的位置,自动读取机制将变得相对简单。事实上,如果这些点是孔,我们就有 Hollerith 很久以前生产的用于人口普查的打孔卡机,现在在整个商业中使用。如果没有这些机器,某些类型的复杂企业几乎无法运营。
All this complication is needed because of the clumsy way in which we have learned to write figures. If we recorded them positionally, simply by the configuration of a set of dots on a card, the automatic reading mechanism would become comparatively simple. In fact if the dots are holes, we have the punched-card machine long ago produced by Hollerith for the purposes of the census, and now used throughout business. Some types of complex businesses could hardly operate without these machines.
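To illustrate why positional recording makes reading simple, here is a minimal sketch in which each digit becomes a single hole whose row in a column gives its value. The ten-row layout and the names `punch` and `read` are assumptions made for this example; it is not a description of the actual Hollerith card code.

```python
ROWS = 10  # assumed layout: one row per decimal digit 0-9


def punch(number: str) -> list[list[bool]]:
    """Return one column per digit, with a single hole in the row equal to that digit."""
    card = []
    for ch in number:
        column = [False] * ROWS
        column[int(ch)] = True  # the hole's position, not its shape, carries the value
        card.append(column)
    return card


def read(card: list[list[bool]]) -> str:
    """Recover the digits by sensing which row of each column is punched."""
    return "".join(str(column.index(True)) for column in card)


card = punch("1945")
print(read(card))
```

Reading needs only a test for "hole or no hole" at each position, which is the simplification the passage points to.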
添加只是一项操作。执行算术计算还涉及减法、乘法和除法,此外还涉及一些临时存储结果、从存储中取出以进行进一步操作以及通过打印记录最终结果的方法。用于这些目的的机器现在有两种类型:用于会计等的键盘机,手动控制插入数据,并且就操作顺序而言通常自动控制;打孔卡机通常将单独的操作委托给一系列机器,然后将卡片从一台机器转移到另一台机器。两种形式都非常有用;但就复杂计算而言,两者仍处于萌芽状态。
Adding is only one operation. To perform arithmetical computation involves also subtraction, multiplication, and division, and in addition some method for temporary storage of results, removal from storage for further manipulation, and recording of final results by printing. Machines for these purposes are now of two types: keyboard machines for accounting and the like, manually controlled for the insertion of data, and usually automatically controlled as far as the sequence of operations is concerned; and punched-card machines in which separate operations are usually delegated to a series of machines, and the cards then transferred bodily from one to another. Both forms are very useful; but as far as complex computations are concerned, both are still in embryo.
在物理学家发现计算宇宙射线是可取的之后不久,快速电子计数就出现了。为了自己的目的,物理学家立即建造了能够以每秒 100,000 次的速度计算电脉冲的热电子管设备。未来的先进算术机器本质上将是电动的,它们的运行速度将是现在的 100 倍,甚至更高。
Rapid electrical counting appeared soon after the physicists found it desirable to count cosmic rays. For their own purposes the physicists promptly constructed thermionic-tube equipment capable of counting electrical impulses at the rate of 100,000 a second. The advanced arithmetical machines of the future will be electrical in nature, and they will perform at 100 times present speeds, or more.
此外,它们将比现有的商用机器更加通用,因此可以轻松适应各种操作。它们将由控制卡或胶片控制,它们将选择自己的数据并根据插入的指令对其进行操作,它们将以极高的速度执行复杂的算术计算,并以以下形式记录结果:易于分发或供以后进一步操作。这样的机器将会有巨大的需求。其中一个将从一屋子女孩那里获取指令和数据,这些女孩配备了简单的键盘,并每隔几分钟提供计算结果表。在数百万人做复杂事情的详细事务中,总会有大量的事情需要计算。
Moreover, they will be far more versatile than present commercial machines, so that they may readily be adapted for a wide variety of operations. They will be controlled by a control card or film, they will select their own data and manipulate it in accordance with the instructions thus inserted, they will perform complex arithmetical computations at exceedingly high speeds, and they will record results in such form as to be readily available for distribution or for later further manipulation. Such machines will have enormous appetites. One of them will take instructions and data from a whole roomful of girls armed with simple keyboard punches, and will deliver sheets of computed results every few minutes. There will always be plenty of things to compute in the detailed affairs of millions of people doing complicated things.
然而,重复的思维过程并不局限于算术和统计问题。事实上,每当人们按照既定的逻辑过程组合和记录事实时,思维的创造性方面只涉及数据的选择和所采用的过程,而其后的操作本质上是重复性的,因此是合适的事情被降级为机器。沿着这些思路,除了算术的界限之外,还没有做太多的事情,而主要是由于经济形势的原因,可能会做更多的事情。一旦生产方法足够先进,商业需求和广阔的市场显然在等待,保证了大规模生产的算术机器的出现。
The repetitive processes of thought are not confined, however, to matters of arithmetic and statistics. In fact, every time one combines and records facts in accordance with established logical processes, the creative aspect of thinking is concerned only with the selection of the data and the process to be employed, and the manipulation thereafter is repetitive in nature and hence a fit matter to be relegated to the machine. Not so much has been done along these lines, beyond the bounds of arithmetic, as might be done, primarily because of the economics of the situation. The needs of business, and the extensive market obviously waiting, assured the advent of mass-produced arithmetical machines just as soon as production methods were sufficiently advanced.
对于高级分析机器来说,这种情况就不存在了;因为无论过去还是现在都不存在广阔的市场;使用先进数据处理方法的用户只是人口中的一小部分。然而,有一些机器可以求解微分方程以及函数方程和积分方程。有许多特殊的机器,例如预测潮汐的谐波合成器。还会有更多,肯定首先出现在科学家手中,而且数量很少。
With machines for advanced analysis no such situation existed; for there was and is no extensive market; the users of advanced methods of manipulating data are a very small part of the population. There are, however, machines for solving differential equations—and functional and integral equations, for that matter. There are many special machines, such as the harmonic synthesizer which predicts the tides. There will be many more, appearing certainly first in the hands of the scientist and in small numbers.
如果科学推理仅限于算术的逻辑过程,那么我们对物理世界的理解就不会走得太远。人们不妨尝试完全通过使用概率数学来掌握扑克游戏。算盘的珠子串在平行的电线上,使阿拉伯人比世界其他地区早几个世纪就掌握了位置计数和零的概念。它是一个有用的工具——非常有用,以至于它仍然存在。
If scientific reasoning were limited to the logical processes of arithmetic, we should not get far in our understanding of the physical world. One might as well attempt to grasp the game of poker entirely by the use of the mathematics of probability. The abacus, with its beads strung on parallel wires, led the Arabs to positional numeration and the concept of zero many centuries before the rest of the world; and it was a useful tool—so useful that it still exists.
这与算盘和现代键盘记账机相差甚远。这将是迈向未来算术机器的同等一步。但即使是这台新机器也无法将科学家带到他需要去的地方。如果高等数学的使用者想要将他们的大脑解放出来,去做一些比按照既定规则进行重复的详细转换更多的事情,那么就必须从对高等数学的繁琐的详细操作中获得解脱。数学家不是一个可以轻易操纵数字的人;他常常做不到。他甚至不是一个能够轻易地用微积分进行方程变换的人。他主要是一个善于在高层次上使用符号逻辑的人,尤其是他在选择他所采用的操作过程时具有直觉判断力。
It is a far cry from the abacus to the modern keyboard accounting machine. It will be an equal step to the arithmetical machine of the future. But even this new machine will not take the scientist where he needs to go. Relief must be secured from laborious detailed manipulation of higher mathematics as well, if the users of it are to free their brains for something more than repetitive detailed transformations in accordance with established rules. A mathematician is not a man who can readily manipulate figures; often he cannot. He is not even a man who can readily perform the transformations of equations by the use of calculus. He is primarily an individual who is skilled in the use of symbolic logic on a high plane, and especially he is a man of intuitive judgment in the choice of the manipulative processes he employs.
他应该能够将所有其他事情交给他的机械装置,就像他将汽车的推进交给引擎盖下的复杂机械装置一样自信。只有这样,数学才能真正有效地将不断增长的原子学知识有效地解决化学、冶金和生物学的高级问题。出于这个原因,仍然有更多的机器来为科学家处理高级数学。其中一些非常奇特,足以满足当今文明文物最挑剔的鉴赏家的需求。
All else he should be able to turn over to his mechanism, just as confidently as he turns over the propelling of his car to the intricate mechanism under the hood. Only then will mathematics be practically effective in bringing the growing knowledge of atomistics to the useful solution of the advanced problems of chemistry, metallurgy, and biology. For this reason there still come more machines to handle advanced mathematics for the scientist. Some of them will be sufficiently bizarre to suit the most fastidious connoisseur of the present artifacts of civilization.
然而,科学家并不是唯一一个运用逻辑过程来处理数据、审视周围世界的人,尽管他有时会把任何变得有逻辑的人都吸纳进自己的圈子来维持这种表象,这很像英国劳工领袖被册封为爵士的方式。每当采用思维的逻辑过程时——也就是说,每当思维在一段时间内沿着公认的轨道运行时——机器就有机会。形式逻辑曾经是教师手中用来考验学生灵魂的利器。只需巧妙地使用继电器电路,就可以很容易地构造一台能够根据形式逻辑操纵前提的机器。
The scientist, however, is not the only person who manipulates data and examines the world about him by the use of logical processes, although he sometimes preserves this appearance by adopting into the fold anyone who becomes logical, much in the manner in which a British labor leader is elevated to knighthood. Whenever logical processes of thought are employed—that is, whenever thought for a time runs along an accepted groove—there is an opportunity for the machine. Formal logic used to be a keen instrument in the hands of the teacher in his trying of students’ souls. It is readily possible to construct a machine which will manipulate premises in accordance with formal logic, simply by the clever use of relay circuits.
将一组前提放入这样的设备中并转动曲柄,它会很容易地得出一个又一个的结论,所有这些都符合逻辑定律,并且不会比键盘加法机预期的更多错误。[编辑:与第 8 页比较。]
Put a set of premises into such a device and turn the crank, and it will readily pass out conclusion after conclusion, all in accordance with logical law, and with no more slips than would be expected of a keyboard adding machine. [EDITOR: Compare to page 8.]
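The "turn the crank" behavior can be sketched in software as repeated application of one inference rule until nothing new appears. The sketch below assumes premises are plain strings and rules are "if all of these, then that" pairs; the function name `turn_crank` and the example statements are hypothetical, and the essay itself speaks only of relay circuits, not of any particular procedure.

```python
def turn_crank(facts, rules):
    """Apply modus ponens repeatedly; return conclusions in the order they are derived.

    rules is a list of (premises, conclusion) pairs, where premises is a tuple of strings.
    """
    known = set(facts)
    derived = []
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                derived.append(conclusion)
                changed = True
    return derived


rules = [
    (("socrates_is_a_man",), "socrates_is_mortal"),
    (("socrates_is_mortal", "mortals_die"), "socrates_dies"),
]
conclusions = turn_crank({"socrates_is_a_man", "mortals_die"}, rules)
print(conclusions)
```

Each pass of the loop corresponds to one turn of the crank; the machine stops when a full pass yields no new conclusion.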
逻辑可能变得非常困难,毫无疑问,在使用逻辑时提供更多保证会更好。用于高级分析的机器通常是方程求解器。方程变换器的想法开始出现,它将根据严格且相当先进的逻辑重新排列方程所表达的关系。数学家表达其关系的极其粗糙的方式阻碍了进步。他们使用的象征主义与托普西一样,缺乏一致性。在这个最符合逻辑的领域里,这是一个奇怪的事实。
Logic can become enormously difficult, and it would undoubtedly be well to produce more assurance in its use. The machines for higher analysis have usually been equation solvers. Ideas are beginning to appear for equation transformers, which will rearrange the relationship expressed by an equation in accordance with strict and rather advanced logic. Progress is inhibited by the exceedingly crude way in which mathematicians express their relationships. They employ a symbolism which grew like Topsy and has little consistency; a strange fact in that most logical field.
显然,一种新的象征主义(可能是位置象征主义)必须先于将数学变换简化为机器过程。然后,超越数学家的严格逻辑,就是逻辑在日常事务中的应用。有一天,我们可能会在一台机器上结束争论,就像我们现在在收银机上输入销售数据一样。但逻辑机器看起来不会像收银机,即使是精简模型也是如此。
A new symbolism, probably positional, must apparently precede the reduction of mathematical transformations to machine processes. Then, on beyond the strict logic of the mathematician, lies the application of logic in everyday affairs. We may some day click off arguments on a machine with the same assurance that we now enter sales on a cash register. But the machine of logic will not look like a cash register, even of the streamlined model.
关于想法的操纵和将它们插入记录就到此为止了。到目前为止,我们的情况似乎比以前更糟——因为我们可以极大地扩大记录;然而,即使就目前的规模而言,我们也很难查阅它。这不仅仅是为了科学研究目的而提取数据的问题。它涉及人类通过继承所获得的知识而获利的整个过程。使用的主要行为是选择,但在这里我们确实停下来了。可能有数百万个美好的想法,以及它们所基于的经验的描述,都被封装在可接受的建筑形式的石墙内;但如果学者通过勤奋的搜索每周只能得到一篇,那么他的综合就不可能跟上当前的情况。
So much for the manipulation of ideas and their insertion into the record. Thus far we seem to be worse off than before—for we can enormously extend the record; yet even in its present bulk we can hardly consult it. This is a much larger matter than merely the extraction of data for the purposes of scientific research; it involves the entire process by which man profits by his inheritance of acquired knowledge. The prime action of use is selection, and here we are halting indeed. There may be millions of fine thoughts, and the account of the experience on which they are based, all encased within stone walls of acceptable architectural form; but if the scholar can get at only one a week by diligent search, his syntheses are not likely to keep up with the current scene.
广义上的选择,就是木工手中的石锛。然而,从狭义上讲,在其他领域,选择已经机械地做了一些事情。一家工厂的人事官员将一叠几千张员工卡放入选择机中,按照既定惯例设置代码,并在短时间内生成一份居住在特伦顿且懂西班牙语的所有员工的名单。即使是这样的设备,例如将一组指纹与存档的 500 万个指纹进行匹配时也太慢了。此类选择设备很快就会从目前每分钟数百次的数据审查速度加快。通过使用光电管和缩微胶卷,他们将以每秒一千次的速度调查物品,并打印出所选物品的副本。
Selection, in this broad sense, is a stone adze in the hands of a cabinetmaker. Yet, in a narrow sense and in other areas, something has already been done mechanically on selection. The personnel officer of a factory drops a stack of a few thousand employee cards into a selecting machine, sets a code in accordance with an established convention, and produces in a short time a list of all employees who live in Trenton and know Spanish. Even such devices are much too slow when it comes, for example, to matching a set of fingerprints with one of five million on file. Selection devices of this sort will soon be speeded up from their present rate of reviewing data at a few hundred a minute. By the use of photocells and microfilm they will survey items at the rate of a thousand a second, and will print out duplicates of those selected.
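The personnel example is what would now be called a linear scan: every card is examined in turn and kept if it matches the code that was set. A minimal sketch, in which the `EmployeeCard` fields and the sample names are invented for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmployeeCard:
    name: str
    city: str
    languages: frozenset


def select(cards, city, language):
    """Examine every card in the stack and keep the names of those that match both criteria."""
    return [c.name for c in cards if c.city == city and language in c.languages]


stack = [
    EmployeeCard("Ames", "Trenton", frozenset({"English", "Spanish"})),
    EmployeeCard("Baker", "Newark", frozenset({"English", "Spanish"})),
    EmployeeCard("Cole", "Trenton", frozenset({"English"})),
]
print(select(stack, "Trenton", "Spanish"))
```

The cost grows with the size of the stack, which is why the passage finds the approach too slow for five million fingerprint records.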
然而,这个过程是简单的选择:它通过依次检查一大组项目中的每一个项目,并挑选出具有某些特定特征的项目来进行。还有另一种选择形式,最好的例子就是自动电话交换机。您拨打一个号码,机器就会选择并连接一百万个可能的电台中的一个。它并没有涵盖所有这些。它仅关注由第一个数字给出的类,然后仅关注由第二个数字给出的该类的子类,依此类推;从而快速且几乎准确无误地到达选定的车站。做出选择需要几秒钟的时间,但如果经济上可以保证提高速度,则可以加快该过程。如有必要,可以通过用热电子管开关代替机械开关来使其速度极快,从而可以在百分之一秒内完成完整的选择。没有人愿意花费必要的钱来对电话系统进行这种改变,但总体想法适用于其他地方。
This process, however, is simple selection: it proceeds by examining in turn every one of a large set of items, and by picking out those which have certain specified characteristics. There is another form of selection best illustrated by the automatic telephone exchange. You dial a number and the machine selects and connects just one of a million possible stations. It does not run over them all. It pays attention only to a class given by a first digit, then only to a subclass of this given by the second digit, and so on; and thus proceeds rapidly and almost unerringly to the selected station. It requires a few seconds to make the selection, although the process could be speeded up if increased speed were economically warranted. If necessary, it could be made extremely fast by substituting thermionic-tube switching for mechanical switching, so that the full selection could be made in one one-hundredth of a second. No one would wish to spend the money necessary to make this change in the telephone system, but the general idea is applicable elsewhere.
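The telephone-exchange style of selection narrows the candidates one digit at a time instead of running over them all. A minimal sketch using nested dictionaries as the switching stages; the names `build_exchange` and `connect`, the sample numbers, and the assumption that all numbers have the same length are illustrative choices, not details from the essay:

```python
def build_exchange(stations):
    """Arrange stations into nested switching stages, one stage per dialed digit."""
    root = {}
    for number, station in stations.items():
        node = root
        for digit in number[:-1]:
            node = node.setdefault(digit, {})
        node[number[-1]] = station
    return root


def connect(exchange, number):
    """Follow one digit per stage; return the station reached and the number of steps taken."""
    node = exchange
    steps = 0
    for digit in number:
        node = node[digit]  # attend only to the class named by this digit
        steps += 1
    return node, steps


exchange = build_exchange({"555012": "Station A", "555013": "Station B", "617000": "Station C"})
print(connect(exchange, "555013"))
```

With six decimal digits, one of a million possible stations is reached in six steps, regardless of how many stations are actually installed.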
以大型百货商店的平淡问题为例。每次进行收费销售时,都有很多事情要做。库存需要修改,销售人员需要获得销售信用,一般账户需要录入,最重要的是,需要向客户收费。已经开发了一个中央记录设备,其中大部分工作很方便地完成。推销员将顾客的身份证、他自己的卡以及从所售商品中取出的卡放在架子上——所有这些都是打孔卡。当他拉动杠杆时,通过孔进行接触,中心点的机械进行必要的计算和输入,并打印正确的收据以供销售员传递给客户。
Take the prosaic problem of the great department store. Every time a charge sale is made, there are a number of things to be done. The inventory needs to be revised, the salesman needs to be given credit for the sale, the general accounts need an entry, and, most important, the customer needs to be charged. A central records device has been developed in which much of this work is done conveniently. The salesman places on a stand the customer’s identification card, his own card, and the card taken from the article sold—all punched cards. When he pulls a lever, contacts are made through the holes, machinery at a central point makes the necessary computations and entries, and the proper receipt is printed for the salesman to pass to the customer.
但可能有一万个收费客户与商店做生意,在完成全部操作之前,必须有人选择正确的卡并将其插入中央办公室。现在,快速选择可以在一两瞬间将正确的卡片滑入到位,然后将其返回。然而,另一个困难出现了。必须有人读取卡上的总计,以便机器可以将其计算的项目添加到其中。可以想象,这些卡片可能属于我所描述的干摄影类型。然后可以通过光电管读取现有的总数,并通过电子束输入新的总数。
But there may be ten thousand charge customers doing business with the store, and before the full operation can be completed someone has to select the right card and insert it at the central office. Now rapid selection can slide just the proper card into position in an instant or two, and return it afterward. Another difficulty occurs, however. Someone must read a total on the card, so that the machine can add its computed item to it. Conceivably the cards might be of the dry photography type I have described. Existing totals could then be read by photocell, and the new total entered by an electron beam.
这些卡片可能是微型的,因此它们占用的空间很小。他们必须迅速行动。它们不需要被转移很远,而只需转移到适当的位置,以便光电管和记录器可以对它们进行操作。位置点可以输入数据。到月底,机器就可以轻松地读取这些信息并打印普通账单。通过管选择,开关中不涉及机械部件,只需很少的时间即可将正确的卡投入使用——整个操作一秒钟就足够了。如果需要的话,卡上的整个记录可以由钢板上的磁点制成,而不是通过光学观察的点,遵循波尔森很久以前在磁线上放置语音的方案。该方法具有简单、易于擦除的优点。然而,通过使用摄影,人们可以利用电视设备中常见的过程以放大的形式在远处投影该记录。
The cards may be in miniature, so that they occupy little space. They must move quickly. They need not be transferred far, but merely into position so that the photocell and recorder can operate on them. Positional dots can enter the data. At the end of the month a machine can readily be made to read these and to print an ordinary bill. With tube selection, in which no mechanical parts are involved in the switches, little time need be occupied in bringing the correct card into use—a second should suffice for the entire operation. The whole record on the card may be made by magnetic dots on a steel sheet if desired, instead of dots to be observed optically, following the scheme by which Poulsen long ago put speech on a magnetic wire. This method has the advantage of simplicity and ease of erasure. By using photography, however, one can arrange to project the record in enlarged form and at a distance by using the process common in television equipment.
人们可以考虑快速选择这种形式,并将远距离投影用于其他目的。能够在操作员面前在一两秒内键入一百万张纸,然后可以在其中添加注释,这在很多方面都具有启发性。它甚至可能在图书馆中有用,但那是另一回事了。无论如何,现在可能出现一些有趣的组合。例如,人们可以以结合语音控制打字机描述的方式对着麦克风说话,从而做出他的选择。它肯定会击败普通的档案管理员。
One can consider rapid selection of this form, and distant projection for other purposes. To be able to key one sheet of a million before an operator in a second or two, with the possibility of then adding notes thereto, is suggestive in many ways. It might even be of use in libraries, but that is another story. At any rate, there are now some interesting combinations possible. One might, for example, speak to a microphone, in the manner described in connection with the speech-controlled typewriter, and thus make his selections. It would certainly beat the usual file clerk.
然而,选择问题的真正核心不仅仅是图书馆采用机制的滞后,或者缺乏可供其使用的设备的开发。我们在获取记录方面的无能很大程度上是由于索引系统的人为性造成的。当任何类型的数据放入存储中时,它们都会按字母或数字顺序归档,并且通过从子类到子类追踪来找到信息(如果有)。它只能位于一处,除非使用重复项;必须有规则来确定哪条路径将找到它,而这些规则很麻烦。此外,找到一件物品后,就必须从系统中出来,重新进入一条新的道路。
The real heart of the matter of selection, however, goes deeper than a lag in the adoption of mechanisms by libraries, or a lack of development of devices for their use. Our ineptitude in getting at the record is largely caused by the artificiality of systems of indexing. When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. It can be in only one place, unless duplicates are used; one has to have rules as to which path will locate it, and the rules are cumbersome. Having found one item, moreover, one has to emerge from the system and re-enter on a new path.
人类的思维并不是这样运作的。它通过联想运作。当它抓住一个项目时,它会根据大脑细胞所承载的某种错综复杂的路径网络,立即跳到由思想联想所暗示的下一个项目。当然,它还有其他特征;不经常走的路径容易消退,项目并不是完全永久的,记忆是短暂的。然而,行动的速度、路径的错综复杂、脑海中画面的细节,都比自然界中的其他一切更令人惊叹。
The human mind does not work that way. It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. It has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory. Yet the speed of action, the intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature.
人类不能完全希望人为地复制这种心理过程,但他当然应该能够从中学习。他甚至可能在一些小方面有所进步,因为他的记录相对来说是永久性的。然而,从类比中得出的第一个想法涉及选择。通过关联而不是索引进行选择可能仍会被机械化。因此,我们不能指望能够与大脑追随联想轨迹的速度和灵活性相媲美,但在从存储中复活的物品的持久性和清晰度方面,应该可以果断地击败大脑。
Man cannot hope fully to duplicate this mental process artificially, but he certainly ought to be able to learn from it. In minor ways he may even improve, for his records have relative permanency. The first idea, however, to be drawn from the analogy concerns selection. Selection by association, rather than indexing, may yet be mechanized. One cannot hope thus to equal the speed and flexibility with which the mind follows an associative trail, but it should be possible to beat the mind decisively in regard to the permanence and clarity of the items resurrected from storage.
考虑未来供个人使用的设备,它是一种机械化的私人文件和图书馆。它需要一个名字,随便造一个名字,“memex”就可以了。memex 是一种个人存储所有书籍、记录和通讯的设备,并且是机械化的,因此可以以超快的速度和灵活性进行查阅。这是对他记忆的扩大的亲密补充。
Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and, to coin one at random, “memex” will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.
它由一张桌子组成,虽然它可能可以远距离操作,但它主要是他工作的家具。顶部是倾斜的半透明屏幕,可以将材料投影在屏幕上,方便阅读。有一个键盘、一组按钮和控制杆。否则它看起来就像一张普通的桌子。
It consists of a desk, and while it can presumably be operated from a distance, it is primarily the piece of furniture at which he works. On the top are slanting translucent screens, on which material can be projected for convenient reading. There is a keyboard, and sets of buttons and levers. Otherwise it looks like an ordinary desk.
一端是存储的材料。改进的缩微胶卷很好地解决了体积问题。memex 内部只有一小部分用于存储,其余部分用于机械装置。然而,如果用户每天插入 5000 页材料,他将需要数百年才能填满存储库,因此他可以随意输入材料。
In one end is the stored material. The matter of bulk is well taken care of by improved microfilm. Only a small part of the interior of the memex is devoted to storage, the rest to mechanism. Yet if the user inserted 5000 pages of material a day it would take him hundreds of years to fill the repository, so he can be profligate and enter material freely.
大多数 memex 内容都是以缩微胶片形式购买的,可以随时插入。各种书籍、图片、最新期刊、报纸就这样被获取并放置到位。商务信函也走同样的道路。并且有直接录入的规定。memex 的顶部是一个透明的压板。上面放着手写笔记、照片、备忘录和各种各样的东西。当其中一件就位时,按下控制杆即可采用干式摄影将其拍摄到 memex 胶片某一部分的下一个空白区域。当然,可以通过通常的索引方案来查询记录。如果用户想要查阅某本书,他在键盘上敲击该书的代码,该书的扉页立即出现在他面前,投影到他的一个观看位置。常用的代码便于记忆,所以他很少查阅代码本;但当他需要查阅时,只需轻按一个键即可将其投射出来供他使用。此外,他还有辅助控制杆。当他将其中一个控制杆向右偏转时,他会浏览面前的书,每一页依次以仅允许一眼认出每一页的速度投影。如果他将控制杆进一步向右偏转,他就会一次 10 页地翻阅这本书;再进一步,一次 100 页。向左偏转则给他同样的向后控制能力。
Most of the memex contents are purchased on microfilm ready for insertion. Books of all sorts, pictures, current periodicals, newspapers, are thus obtained and dropped into place. Business correspondence takes the same path. And there is provision for direct entry. On the top of the memex is a transparent platen. On this are placed longhand notes, photographs, memoranda, all sorts of things. When one is in place, the depression of a lever causes it to be photographed onto the next blank space in a section of the memex film, dry photography being employed. There is, of course, provision for consultation of the record by the usual scheme of indexing. If the user wishes to consult a certain book, he taps its code on the keyboard, and the title page of the book promptly appears before him, projected onto one of his viewing positions. Frequently-used codes are mnemonic, so that he seldom consults his code book; but when he does, a single tap of a key projects it for his use. Moreover, he has supplemental levers. On deflecting one of these levers to the right he runs through the book before him, each page in turn being projected at a speed which just allows a recognizing glance at each. If he deflects it further to the right, he steps through the book 10 pages at a time; still further at 100 pages at a time. Deflection to the left gives him the same control backwards.
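The paging lever described here maps how far it is deflected to a step size, with the direction of deflection giving the direction of travel. A minimal sketch, assuming three discrete deflection levels (the essay names the step sizes 1, 10, and 100 but not how the lever positions are graduated) and a hypothetical function name `next_page`:

```python
# Assumed graduation: deflection magnitude 1, 2, 3 selects steps of 1, 10, 100 pages.
STEP_BY_DEFLECTION = {0: 0, 1: 1, 2: 10, 3: 100}


def next_page(current: int, deflection: int, last_page: int) -> int:
    """Positive deflection (right) pages forward, negative (left) pages backward."""
    step = STEP_BY_DEFLECTION[abs(deflection)]
    if deflection < 0:
        step = -step
    # Stay within the book: pages are numbered 1..last_page.
    return max(1, min(last_page, current + step))


print(next_page(1, 2, 500), next_page(50, -3, 500))
```

Clamping at the first and last page is a choice made for this sketch; the essay does not say what happens at the ends of a book.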
一个特殊的按钮可以立即将他转到索引的第一页。因此,他的图书馆中的任何一本书都可以比从书架上取出来更方便地调用和查阅。由于他有多个投影位置,因此他可以将一个项目留在原位,同时调用另一个项目。他可以利用一种可能的干摄影类型来添加旁注和评论,甚至可以进行安排,以便他可以通过手写笔方案来完成此操作,例如现在在铁路候车室中看到的电话记录仪中使用的方法,只是就好像他面前有实体页面一样。
A special button transfers him immediately to the first page of the index. Any given book of his library can thus be called up and consulted with far greater facility than if it were taken from a shelf. As he has several projection positions, he can leave one item in position while he calls up another. He can add marginal notes and comments, taking advantage of one possible type of dry photography, and it could even be arranged so that he can do this by a stylus scheme, such as is now employed in the telautograph seen in railroad waiting rooms, just as though he had the physical page before him.
所有这些都是传统的,除了当今机制和小工具的向前发展。然而,它为关联索引提供了直接的步骤,其基本思想是可以使任何项目随意立即自动选择另一个项目。这是memex 的基本特征。将两个物品绑在一起的过程很重要。
All this is conventional, except for the projection forward of present-day mechanisms and gadgetry. It affords an immediate step, however, to associative indexing, the basic idea of which is a provision whereby any item may be caused at will to select immediately and automatically another. This is the essential feature of the memex. The process of tying two items together is the important thing.
当用户构建路径时,他会为其命名,将名称插入其密码本中,然后在键盘上敲击它。在他面前是要连接的两个项目,投影到相邻的观看位置。每个项目的底部都有许多空白代码空间,并且设置了一个指针来指示每个项目上的其中一个。用户点击一个键,项目就会永久连接。在每个代码空间中出现代码字。在代码空间之外的视图之外,插入了一组用于光电管查看的点;在每个项目上,这些点按其位置指定其他项目的索引号。
When the user is building a trail, he names it, inserts the name in his code book, and taps it out on his keyboard. Before him are the two items to be joined, projected onto adjacent viewing positions. At the bottom of each there are a number of blank code spaces, and a pointer is set to indicate one of these on each item. The user taps a single key, and the items are permanently joined. In each code space appears the code word. Out of view, but also in the code space, is inserted a set of dots for photocell viewing; and on each item these dots by their positions designate the index number of the other item.
此后,在任何时候,当这些项目之一出现在视图中时,只需点击相应代码空间下方的按钮即可立即调用另一个项目。此外,当许多项目如此连接在一起形成一条轨迹时,可以通过像用于翻书页那样偏转杠杆来快速或缓慢地依次查看它们。就好像这些物理物品是从分散的来源收集在一起并装订在一起形成一本新书一样。不仅如此,任何项目都可以连接成许多路径。
Thereafter, at any time, when one of these items is in view, the other can be instantly recalled merely by tapping a button below the corresponding code space. Moreover, when numerous items have been thus joined together to form a trail, they can be reviewed in turn, rapidly or slowly, by deflecting a lever like that used for turning the pages of a book. It is exactly as though the physical items had been gathered together from widely separated sources and bound together to form a new book. It is more than this, for any item can be joined into numerous trails.
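The joining mechanism can be modeled as a small data structure: each item carries code spaces that name a trail and point to another item's index number, and a trail is the ordered sequence of items so joined. The sketch below is one possible reading; the class and method names are invented here, and it assumes for simplicity that a trail is always extended from its current last item.

```python
from collections import defaultdict


class Memex:
    def __init__(self):
        self.items = {}                       # index number -> content
        self.code_spaces = defaultdict(list)  # index number -> [(trail name, other index)]
        self.trails = defaultdict(list)       # trail name -> ordered index numbers

    def add_item(self, index, content):
        self.items[index] = content

    def join(self, trail, a, b):
        """Tie items a and b together: each records the trail name and the other's index."""
        self.code_spaces[a].append((trail, b))
        self.code_spaces[b].append((trail, a))
        path = self.trails[trail]
        if not path:
            path.append(a)
        path.append(b)  # assumes a is the trail's current last item

    def recall(self, index, trail):
        """Index numbers reachable from this item through code spaces on the given trail."""
        return [other for name, other in self.code_spaces[index] if name == trail]

    def review(self, trail):
        """Contents of the trail in order, as if bound together into a new book."""
        return [self.items[i] for i in self.trails[trail]]


memex = Memex()
memex.add_item(1, "encyclopedia article")
memex.add_item(2, "history passage")
memex.add_item(3, "elasticity textbook")
memex.join("bow", 1, 2)
memex.join("bow", 2, 3)
memex.join("materials", 3, 2)  # the same items may belong to more than one trail
print(memex.review("bow"), memex.recall(2, "bow"))
```

Because links live on the items rather than in a single index, an item can sit on any number of trails without being duplicated.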
比方说,memex 的所有者对弓箭的起源和特性感兴趣。具体来说,他正在研究为什么土耳其短弓在十字军东征的小规模战斗中明显优于英国长弓。他的 memex 中有数十本可能相关的书籍和文章。首先,他浏览了一本百科全书,找到一篇有趣但粗略的文章,并让它保持投影。接下来,在一本历史书中,他找到了另一个相关项目,并将两者联系在一起。他就这样继续下去,建立了一条由许多项目组成的路径。有时,他会插入自己的评论,要么将其链接到主路径中,要么通过旁路将其连接到特定项目。当很明显可用材料的弹性特性与弓有很大关系时,他又岔入一条旁路,这带他浏览了有关弹性的教科书和物理常数表。他插入了一页他自己的手写分析。因此,他在可用材料的迷宫中建立了一条体现他兴趣的路径。
The owner of the memex, let us say, is interested in the origin and properties of the bow and arrow. Specifically he is studying why the short Turkish bow was apparently superior to the English long bow in the skirmishes of the Crusades. He has dozens of possibly pertinent books and articles in his memex. First he runs through an encyclopedia, finds an interesting but sketchy article, leaves it projected. Next, in a history, he finds another pertinent item, and ties the two together. Thus he goes, building a trail of many items. Occasionally he inserts a comment of his own, either linking it into the main trail or joining it by a side trail to a particular item. When it becomes evident that the elastic properties of available materials had a great deal to do with the bow, he branches off on a side trail which takes him through textbooks on elasticity and tables of physical constants. He inserts a page of longhand analysis of his own. Thus he builds a trail of his interest through the maze of materials available to him.
而他的轨迹也不会消失。几年后,他与一位朋友的谈话转向了人们抵制创新的奇怪方式,即使是至关重要的创新。他有一个例子,即愤怒的欧洲人仍然没有采用土耳其弓。事实上,他有一条关于此事的轨迹。轻轻一触即可调出代码簿。点击几个按键即可投射出轨迹的起点。拨动杠杆可以随意浏览它,停在有趣的项目上,或者转入旁支。这是一条有趣的轨迹,与讨论相关。因此,他启动复制器,将整条轨迹拍摄下来,并将其交给他的朋友,以便插入到朋友自己的 memex 中,在那里链接到更一般的轨迹。
And his trails do not fade. Several years later, his talk with a friend turns to the queer ways in which a people resist innovations, even of vital interest. He has an example, in the fact that the outraged Europeans still failed to adopt the Turkish bow. In fact he has a trail on it. A touch brings up the code book. Tapping a few keys projects the head of the trail. A lever runs through it at will, stopping at interesting items, going off on side excursions. It is an interesting trail, pertinent to the discussion. So he sets a reproducer in action, photographs the whole trail out, and passes it to his friend for insertion in his own memex, there to be linked into the more general trail.
全新形式的百科全书将会出现,其中已预先织入关联轨迹的网络,可以直接放入 memex 并在那里扩充。律师可以随手调出他全部经验以及朋友和权威人士经验中的相关意见和判决。专利律师可随时调用数以百万计的已授权专利,并有熟悉的轨迹通向客户关心的每一点。医生对病人的反应感到困惑,于是沿着研究早期类似病例时建立的轨迹,快速浏览类似的病例史,并旁及相关解剖学和组织学的经典著作。化学家正在努力合成一种有机化合物,他的实验室里有全部化学文献可供查阅,其中有依循化合物类比的轨迹,以及通向其物理和化学行为的旁支轨迹。
Wholly new forms of encyclopedias will appear, ready made with a mesh of associative trails running through them, ready to be dropped into the memex and there amplified. The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities. The patent attorney has on call the millions of issued patents, with familiar trails to every point of his client’s interest. The physician, puzzled by a patient’s reactions, strikes the trail established in studying an earlier similar case, and runs rapidly through analogous case histories, with side references to the classics for the pertinent anatomy and histology. The chemist, struggling with the synthesis of an organic compound, has all the chemical literature before him in his laboratory, with trails following the analogies of compounds, and side trails to their physical and chemical behavior.
历史学家对一个民族进行了大量按时间顺序的记述,将其与只停在显着项目上的跳跃轨迹平行,并且可以随时追踪当代轨迹,这些轨迹引导他了解特定时代的文明。开拓者是一种新的职业,他们乐于通过大量的共同记录建立有用的道路。大师的遗产不仅成为他对世界记录的补充,而且对他的弟子来说,成为他们搭建的整个脚手架。
The historian, with a vast chronological account of a people, parallels it with a skip trail which stops only on the salient items, and can follow at any time contemporary trails which lead him all over civilization at a particular epoch. There is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record. The inheritance from the master becomes, not only his additions to the world’s record, but for his disciples the entire scaffolding by which they were erected.
因此,科学可以实现人类生产、存储和查阅种族记录的方式。更引人注目地勾勒出未来的工具,而不是像这里所做的那样,严格遵守现在已知并正在快速发展的方法和元素,可能会令人震惊。当然,各种各样的技术难题都被忽视了,但也忽视了那些未知的手段,这些手段可能会像热电子管的出现一样猛烈地加速技术进步。为了使图片不会太普遍,由于坚持当今的模式,最好提及这样一种可能性,不是预言,而只是暗示,因为基于已知延伸的预言具有实质内容,而建立在未知之上的预言只是一种双重猜测。
Thus science may implement the ways in which man produces, stores, and consults the record of the race. It might be striking to outline the instrumentalities of the future more spectacularly, rather than to stick closely to methods and elements now known and undergoing rapid development, as has been done here. Technical difficulties of all sorts have been ignored, certainly, but also ignored are means as yet unknown which may come any day to accelerate technical progress as violently as did the advent of the thermionic tube. In order that the picture may not be too commonplace, by reason of sticking to present-day patterns, it may be well to mention one such possibility, not to prophesy but merely to suggest, for prophecy based on extension of the known has substance, while prophecy founded on the unknown is only a doubly involved guess.
我们创造或吸收记录材料的所有步骤都是通过一种感官进行的——我们触摸按键时的触觉,我们说或听时的口头,我们阅读时的视觉。难道有一天这条路就不能建立得更直接吗?我们知道当眼睛看到时,所有随之而来的信息都会通过视神经通道中的电振动传输到大脑。这与电视机电缆中发生的电振动进行了精确的类比:它们将图像从光电管传送到无线电发射器,并从该无线电发射器进行广播。我们进一步知道,如果我们能够使用适当的仪器接近该电缆,我们就不需要触摸它;我们可以通过电感应拾取这些振动,从而发现并再现正在传输的场景,就像窃听电话线以获取其消息一样。打字员手臂神经中流动的脉冲将翻译后的信息传递到她的手指,这些信息到达她的眼睛或耳朵,以便手指可以敲击正确的键。这些电流可能不会被拦截,无论是以信息传递到大脑的原始形式,还是以它们随后传递到手的奇妙变形形式?
All our steps in creating or absorbing material of the record proceed through one of the senses—the tactile when we touch keys, the oral when we speak or listen, the visual when we read. Is it not possible that some day the path may be established more directly? We know that when the eye sees, all the consequent information is transmitted to the brain by means of electrical vibrations in the channel of the optic nerve. This is an exact analogy with the electrical vibrations which occur in the cable of a television set: they convey the picture from the photocells which see it to the radio transmitter from which it is broadcast. We know further that if we can approach that cable with the proper instruments, we do not need to touch it; we can pick up those vibrations by electrical induction and thus discover and reproduce the scene which is being transmitted, just as a telephone wire may be tapped for its message. The impulses which flow in the arm nerves of a typist convey to her fingers the translated information which reaches her eye or ear, in order that the fingers may be caused to strike the proper keys. Might not these currents be intercepted, either in the original form in which information is conveyed to the brain, or in the marvelously metamorphosed form in which they then proceed to the hand?
通过骨传导,我们已经将声音引入聋人的神经通道,以便他们能够听到声音。我们是否有可能学会引入它们,而不需要首先将电振动转换为机械振动,然后人体机制立即将其转换回电形式?通过头骨上的几个电极,脑描记器现在可以产生笔墨痕迹,这些痕迹与大脑本身发生的电现象有一定的关系。诚然,这个记录是难以理解的,除非它指出了大脑机制的某些严重故障。但现在谁会对这样的事情可能导致的结果设定界限呢?
By bone conduction we already introduce sounds into the nerve channels of the deaf in order that they may hear. Is it not possible that we may learn to introduce them without the present cumbersomeness of first transforming electrical vibrations to mechanical ones, which the human mechanism promptly transforms back to the electrical form? With a couple of electrodes on the skull the encephalograph now produces pen-and-ink traces which bear some relation to the electrical phenomena going on in the brain itself. True, the record is unintelligible, except as it points out certain gross misfunctioning of the cerebral mechanism; but who would now place bounds on where such a thing may lead?
在外面的世界中,所有形式的智能,无论是声音还是视觉,都被简化为电路中变化电流的形式,以便可以传输。在人体内部也会发生完全相同的过程。为了从一种电现象转变为另一种电现象,我们必须始终转变为机械运动吗?这是一个有启发性的想法,但它很难保证在不脱离现实和即时性的情况下进行预测。
In the outside world, all forms of intelligence whether of sound or sight, have been reduced to the form of varying currents in an electric circuit in order that they may be transmitted. Inside the human frame exactly the same sort of process occurs. Must we always transform to mechanical movements in order to proceed from one electrical phenomenon to another? It is a suggestive thought, but it hardly warrants prediction without losing touch with reality and immediateness.
想必,如果人能更好地回顾自己不光彩的过去,更全面、更客观地分析自己现在的问题,他的精神应该得到提升。他已经建立了一个如此复杂的文明,如果他想要将他的实验推向合乎逻辑的结论,而不是仅仅因为过度消耗他有限的记忆而陷入半途中的困境,他就需要更充分地机械化他的记录。如果他能重新获得忘记手头不需要的各种东西的特权,并且确信如果它们被证明重要,他可以再次找到它们,那么他的探索之旅可能会更愉快。
Presumably man’s spirit should be elevated if he can better review his shady past and analyze more completely and objectively his present problems. He has built a civilization so complex that he needs to mechanize his records more fully if he is to push his experiment to its logical conclusion and not merely become bogged down part way there by overtaxing his limited memory. His excursions may be more enjoyable if he can reacquire the privilege of forgetting the manifold things he does not need to have immediately at hand, with some assurance that he can find them again if they prove important.
科学的应用为人类建造了一座供应充足的房屋,并正在教导人们在其中健康地生活。它们使他能够用残酷的武器让大批人互相攻击。它们或许还能让他真正掌握这份伟大的记录,并在人类经验的智慧中成长。在他学会利用这一记录来谋取真正的利益之前,他可能会在冲突中丧生。然而,在将科学应用于人类的需要和欲望的过程中,在这个阶段终止这一过程,或者对结果失去希望,似乎是极为不幸的。
The applications of science have built man a well-supplied house, and are teaching him to live healthily therein. They have enabled him to throw masses of people against one another with cruel weapons. They may yet allow him truly to encompass the great record and to grow in the wisdom of race experience. He may perish in conflict before he learns to wield that record for his true good. Yet, in the application of science to the needs and desires of man, it would seem to be a singularly unfortunate stage at which to terminate the process, or to lose hope as to the outcome.
经《大西洋月刊》内容机构许可,转载自 Bush (1945a) 。
Reprinted from Bush (1945a), with permission from Tribune Content Agency—The Atlantic.
在写完重要的硕士论文(第 8 章)后,克劳德·香农 (Claude Shannon) 开发了一种用于孟德尔遗传学计算的代数,并于 1940 年将其作为博士论文提交。随后,他转到贝尔实验室,并在这篇论文中发明了信息论。二进制记数法的优点众所周知;EDVAC 团队已决定使用二进制来存储所有内部存储和算术。《初稿》明确指出,“存储器的(容量)单位是保留一个二进制数字的值的能力”(本卷第103页)。香农将这个单位称为“比特”,这一著名说法是由数学家约翰·图基发明的。然后,香农提出了当并非所有消息都可能或同样可能时如何用相对较少的比特数对长消息进行编码的问题。这项工作不仅是数据压缩理论及其限制(霍夫曼编码,例如[Huffman,1952])的基础,而且是当消息位在从源到目的地的传输过程中可能被损坏时重建消息的重要理论的基础(见第 13 章)。现在常用的术语“熵”是从统计物理学借用的,用来表示源信息内容的度量。
After writing his important Master’s thesis (chapter 8), Claude Shannon developed an algebra for computations in Mendelian genetics, submitting it as his PhD dissertation in 1940. He then moved to Bell Labs and, in this paper, invented information theory. The advantages of binary notation were already known; the EDVAC team had settled on binary for all internal storage and arithmetic. The “First Draft” states unequivocally, “The (capacity) unit of memory is the ability to retain the value of one binary digit” (page 103 of this volume). Shannon famously dubs that unit the “bit,” though he attributes the coinage to mathematician John Tukey. Shannon then pursues the question of how to encode long messages with a relatively small number of bits when not all messages may be possible or equally likely. This work is the basis not only for the theory of data compression and its limits (Huffman codes, for example [Huffman, 1952]) but for the important theory of reconstruction of messages when their bits may be corrupted in transit from source to destination (see chapter 13). The term “entropy,” now in common usage, is here borrowed from statistical physics to represent a measure of the information content of a source.
香农后来出版了第三部杰作《保密系统的通信理论》(Shannon,1949),从信息论的角度阐述了密码学的一些原理。这是基于他根据贝尔实验室和美国军方之间的合同所做的机密工作。香农于 1958 年回到麻省理工学院担任教授,长期以来一直是个顽皮的修补匠。他杂耍、骑独轮车,有时两者同时进行。他建造了一台机器,可以对机械鼠标进行杂耍和编程,让它学习走出迷宫的路径。
Shannon later published a third masterpiece, “Communication Theory of Secrecy Systems” (Shannon, 1949), which laid out some of the principles of cryptography from an information-theoretic viewpoint. It was based on classified work he had done under a contract between Bell Labs and the U.S. military. Shannon returned to MIT as a professor in 1958 and long remained the playful tinkerer. He juggled, rode a unicycle, and sometimes did both simultaneously. He built a machine that juggled and programmed a mechanical mouse to learn its path out of a maze.
他最终在数学方面的成就下降,并停止出版和教学,最终于 2001 年因阿尔茨海默病去世,享年 84 岁。
He eventually became less prolific mathematically and stopped publishing and teaching, finally dying in 2001 of Alzheimer’s disease at the age of 84.
最近各种调制方法的发展,例如用带宽换取信噪比的 PCM 和 PPM,增强了人们对一般通信理论的兴趣。[编辑:脉冲编码调制和脉冲位置调制。] 这种理论的基础包含在 Nyquist (1924, 1928) 和 Hartley (1928) 关于这个主题的重要论文中。在本文中,我们将扩展该理论以包括许多新因素,特别是信道中噪声的影响,以及由于原始消息的统计结构和信息最终目的地的性质而可能实现的节省。
The recent development of various methods of modulation such as PCM and PPM which exchange bandwidth for signal-to-noise ratio has intensified the interest in a general theory of communication. [EDITOR: Pulse Code Modulation and Pulse Position Modulation.] A basis for such a theory is contained in the important papers of Nyquist (1924, 1928) and Hartley (1928) on this subject. In the present paper we will extend the theory to include a number of new factors, in particular the effect of noise in the channel, and the savings possible due to the statistical structure of the original message and due to the nature of the final destination of the information.
通信的基本问题是在一个点精确地或近似地再现在另一点选择的消息。这些信息通常是有意义的;也就是说,它们指的是某些系统,或者根据某些系统与某些物理或概念实体相关。通信的这些语义方面与工程问题无关。重要的方面是实际消息是从一组可能的消息中选择的。系统必须设计为针对每种可能的选择进行操作,而不仅仅是实际选择的选择,因为这在设计时是未知的。
The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design.
如果集合中的消息数量是有限的,则该数量或该数量的任何单调函数可以被视为当从集合中选择一条消息时产生的信息的度量,所有选择都是同等可能的。正如哈特利所指出的,最自然的选择是对数函数。尽管当我们考虑消息统计的影响并且当我们有连续范围的消息时,这个定义必须相当普遍,但在所有情况下我们都会使用本质上对数的度量。
If the number of messages in the set is finite then this number or any monotonic function of this number can be regarded as a measure of the information produced when one message is chosen from the set, all choices being equally likely. As was pointed out by Hartley the most natural choice is the logarithmic function. Although this definition must be generalized considerably when we consider the influence of the statistics of the message and when we have a continuous range of messages, we will in all cases use an essentially logarithmic measure.
由于多种原因,对数测量更方便:
The logarithmic measure is more convenient for various reasons:
1.它更实用。工程重要性参数,例如时间、带宽、中继数量等,往往随着可能性数量的对数线性变化。例如,将一个继电器添加到组中会使继电器的可能状态数量加倍。它将这个数字的以 2 为底的对数加 1。将时间加倍大致等于可能消息数量的平方,或将对数加倍等。
1. It is practically more useful. Parameters of engineering importance such as time, bandwidth, number of relays, etc., tend to vary linearly with the logarithm of the number of possibilities. For example, adding one relay to a group doubles the number of possible states of the relays. It adds 1 to the base 2 logarithm of this number. Doubling the time roughly squares the number of possible messages, or doubles the logarithm, etc.
2.对于正确的衡量标准更接近我们的直觉。这与(1)密切相关,因为我们通过与通用标准的线性比较来直观地衡量实体。例如,人们认为两张打孔卡的信息存储容量应是一张的两倍,两个相同的通道的信息传输容量应是一张的两倍。
2. It is nearer to our intuitive feeling as to the proper measure. This is closely related to (1) since we intuitively measure entities by linear comparison with common standards. One feels, for example, that two punched cards should have twice the capacity of one for information storage, and two identical channels twice the capacity of one for transmitting information.
3、数学上更合适。许多限制运算在对数方面很简单,但在可能性数量方面需要笨拙的重述。
3. It is mathematically more suitable. Many of the limiting operations are simple in terms of the logarithm but would require clumsy restatement in terms of the number of possibilities.
对数底的选择对应于测量信息的单位的选择。如果使用基数 2,则所得单位可以称为二进制数字,或更简单地称为“位”(bit),这是 J. W. Tukey 建议的一个词。具有两个稳定位置的设备,例如继电器或触发器电路,可以存储一位信息。N 个这样的设备可以存储 N 位,因为可能状态的总数为 $2^N$,且 $\log_2 2^N = N$。如果使用基数 10,则单位可称为十进制数字。由于
The choice of a logarithmic base corresponds to the choice of a unit for measuring information. If the base 2 is used the resulting units may be called binary digits, or more briefly bits, a word suggested by J. W. Tukey. A device with two stable positions, such as a relay or a flip-flop circuit, can store one bit of information. N such devices can store N bits, since the total number of possible states is $2^N$ and $\log_2 2^N = N$. If the base 10 is used the units may be called decimal digits. Since

$$\log_2 M = \frac{\log_{10} M}{\log_{10} 2} = 3.32 \log_{10} M,$$

一个十进制数字约为 $3\tfrac{1}{3}$ 位。台式计算机器上的数字轮有十个稳定位置,因此具有一位十进制数字的存储容量。在涉及积分和微分的分析工作中,基数 e 有时很有用。由此产生的信息单位称为自然单位。从基数 a 变为基数 b 只需乘以 $\log_b a$。
a decimal digit is about $3\tfrac{1}{3}$ bits. A digit wheel on a desk computing machine has ten stable positions and therefore has a storage capacity of one decimal digit. In analytical work where integration and differentiation are involved the base e is sometimes useful. The resulting units of information will be called natural units. Change from the base a to base b merely requires multiplication by $\log_b a$.
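The unit arithmetic in this paragraph can be checked numerically. The following is a minimal sketch, not part of Shannon's text; the function names `capacity_in_bits` and `convert_units` are hypothetical helpers introduced only for illustration.

```python
import math


def capacity_in_bits(num_states: int) -> float:
    """Bits stored by a device with `num_states` equally likely stable states."""
    return math.log2(num_states)


def convert_units(amount: float, from_base: float, to_base: float) -> float:
    """Convert information measured in base-`from_base` units to base-`to_base`
    units by multiplying by log_{to_base}(from_base), as the text describes."""
    return amount * math.log(from_base, to_base)


relay_bits = capacity_in_bits(2)        # one relay or flip-flop
bank_bits = capacity_in_bits(2 ** 8)    # eight relays: 2^8 joint states
digit_wheel_bits = convert_units(1.0, from_base=10, to_base=2)  # one decimal digit

print(relay_bits, bank_bits, round(digit_wheel_bits, 2))
```

The last value is the base-2 logarithm of 10, which is where the figure of roughly three and a third bits per decimal digit comes from.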
我们所说的通信系统是指图 12.1中示意性指示的类型的系统。它主要由五个部分组成:
By a communication system we will mean a system of the type indicated schematically in Figure 12.1. It consists of essentially five parts:
1. 信息源,产生要传送到接收终端的消息或消息序列。消息可以是各种类型:(a) 字母序列,如电报或电传打字机系统;(b) 单一的时间函数 f(t),如无线电或电话;(c) 时间和其他变量的函数,如黑白电视:这里的消息可以被认为是两个空间坐标和时间的函数 f(x, y, t),即拾取管板上点 (x, y) 处、时间 t 时的光强度;(d) 两个或多个时间函数,例如 f(t)、g(t)、h(t):这是“三维”声音传输的情况,或者系统打算以多路复用方式为多个单独的通道服务的情况;(e) 多个变量的多个函数:在彩色电视中,消息由定义在三维连续体中的三个函数 f(x, y, t)、g(x, y, t)、h(x, y, t) 组成;我们也可以将这三个函数视为该区域中定义的矢量场的分量;类似地,几个黑白电视源将产生由三个变量的多个函数组成的“消息”;(f) 还存在各种组合,例如带有相关音频通道的电视。
1. An information source which produces a message or sequence of messages to be communicated to the receiving terminal. The message may be of various types: (a) A sequence of letters as in a telegraph or teletype system; (b) A single function of time f(t) as in radio or telephony; (c) A function of time and other variables as in black and white television—here the message may be thought of as a function f(x, y, t) of two space coordinates and time, the light intensity at point (x, y) and time t on a pickup tube plate; (d) Two or more functions of time, say f(t), g(t), h(t)—this is the case in “three-dimensional” sound transmission or if the system is intended to service several individual channels in multiplex; (e) Several functions of several variables—in color television the message consists of three functions f(x, y, t), g(x, y, t), h(x, y, t) defined in a three-dimensional continuum—we may also think of these three functions as components of a vector field defined in the region—similarly, several black and white television sources would produce “messages” consisting of a number of functions of three variables; (f) Various combinations also occur, for example in television with an associated audio channel.
2.发射机,以某种方式对消息进行操作,产生适合在信道上传输的信号。在电话中,该操作仅包括将声压改变为成比例的电流。在电报中,我们有一个编码操作,它在与消息相对应的通道上产生一系列点、划线和空格。在多路 PCM 系统中,必须对不同的语音函数进行采样、压缩、量化和编码,最后正确地交织以构建信号。声码器系统、电视和频率调制是应用于消息以获得信号的复杂操作的其他示例。
2. A transmitter which operates on the message in some way to produce a signal suitable for transmission over the channel. In telephony this operation consists merely of changing sound pressure into a proportional electrical current. In telegraphy we have an encoding operation which produces a sequence of dots, dashes and spaces on the channel corresponding to the message. In a multiplex PCM system the different speech functions must be sampled, compressed, quantized and encoded, and finally interleaved properly to construct the signal. Vocoder systems, television and frequency modulation are other examples of complex operations applied to the message to obtain the signal.
3.信道只是用于将信号从发射器传输到接收器的介质。它可以是一对电线、一根同轴电缆、一段射频、一束光束等。
3. The channel is merely the medium used to transmit the signal from transmitter to receiver. It may be a pair of wires, a coaxial cable, a band of radio frequencies, a beam of light, etc.
4.接收器通常执行与发送器所执行的操作相反的操作,从信号中重建消息。
4. The receiver ordinarily performs the inverse operation of that done by the transmitter, reconstructing the message from the signal.
5.目的地是消息所针对的人(或事物)。
5. The destination is the person (or thing) for whom the message is intended.
图 12.1: 一般通信系统的示意图。
Figure 12.1: Schematic diagram of a general communication system.
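The five parts listed above can be mimicked by a toy pipeline. This is only a sketch under stated assumptions: the 32-symbol alphabet, the fixed 5-bit encoding, and the noiseless channel are all hypothetical choices made for illustration, and the function names are not from the source.

```python
import string

# Hypothetical 32-symbol alphabet: 26 letters plus six punctuation symbols.
ALPHABET = string.ascii_uppercase + " .,?'-"
assert len(ALPHABET) == 32


def transmitter(message: str) -> str:
    """Operate on the message to produce a signal: one 5-bit group per symbol."""
    return "".join(format(ALPHABET.index(ch), "05b") for ch in message)


def channel(signal: str) -> str:
    """Idealized noiseless channel: the signal arrives unchanged."""
    return signal


def receiver(signal: str) -> str:
    """Perform the inverse of the transmitter: 5-bit groups back to symbols."""
    groups = (signal[i:i + 5] for i in range(0, len(signal), 5))
    return "".join(ALPHABET[int(group, 2)] for group in groups)


def communicate(message: str) -> str:
    """Source -> transmitter -> channel -> receiver -> destination."""
    return receiver(channel(transmitter(message)))


print(communicate("HELLO, WORLD"))
```

Because the channel here adds no noise, the destination receives exactly what the source produced; a noisy channel would replace the body of `channel`.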
我们希望考虑涉及通信系统的某些一般问题。为此,首先必须将涉及的各种元素表示为数学实体,并根据其物理对应物进行适当理想化。我们可以粗略地将通信系统分为三大类:离散型、连续型和混合型。我们所说的离散系统是指消息和信号都是离散符号序列的系统。一个典型的例子是电报,其中消息是字母序列,信号是点、破折号和空格序列。连续系统是一种消息和信号都被视为连续函数的系统,例如广播或电视。混合系统是一种同时出现离散变量和连续变量的系统,例如语音的 PCM 传输。
We wish to consider certain general problems involving communication systems. To do this it is first necessary to represent the various elements involved as mathematical entities, suitably idealized from their physical counterparts. We may roughly classify communication systems into three main categories: discrete, continuous and mixed. By a discrete system we will mean one in which both the message and the signal are a sequence of discrete symbols. A typical case is telegraphy where the message is a sequence of letters and the signal a sequence of dots, dashes and spaces. A continuous system is one in which the message and signal are both treated as continuous functions, e.g., radio or television. A mixed system is one in which both discrete and continuous variables appear, e.g., PCM transmission of speech.
我们首先考虑离散情况。该案例不仅在通信理论中具有应用,而且在计算机器理论、电话交换机设计等领域也有应用。此外,离散案例为连续案例和混合案例奠定了基础,连续案例和混合案例将在本文的后半部分进行处理。
We first consider the discrete case. This case has applications not only in communication theory, but also in the theory of computing machines, the design of telephone exchanges and other fields. In addition the discrete case forms a foundation for the continuous and mixed cases which will be treated in the second half of the paper.
电传打字机和电报是用于传输信息的离散信道的两个简单示例。一般而言,离散信道是指这样一种系统:通过它可以将从有限的基本符号集合 $S_1, \ldots, S_n$ 中选出的符号序列从一个点传输到另一个点。假定每个符号 $S_i$ 具有一定的持续时间 $t_i$ 秒(对于不同的 $S_i$ 不一定相同,例如电报中的点和划)。并不要求 $S_i$ 的所有可能序列都能够在系统上传输;可以只允许某些序列。这些序列就是该信道可能的信号。因此,在电报中,假设符号是:(1) 点,由线路闭合一个时间单位、然后线路断开一个时间单位组成;(2) 划,由线路闭合三个时间单位和断开一个时间单位组成;(3) 字母间隔,例如由线路断开三个时间单位组成;(4) 单词间隔,由线路断开六个时间单位组成。我们可以对允许的序列施加这样的限制:间隔不能彼此相连(因为如果两个字母间隔相邻,就等同于一个单词间隔)。我们现在考虑的问题是如何度量这种信道传输信息的容量。
Teletype and telegraphy are two simple examples of a discrete channel for transmitting information. Generally, a discrete channel will mean a system whereby a sequence of choices from a finite set of elementary symbols $S_1, \ldots, S_n$ can be transmitted from one point to another. Each of the symbols $S_i$ is assumed to have a certain duration in time $t_i$ seconds (not necessarily the same for different $S_i$, for example the dots and dashes in telegraphy). It is not required that all possible sequences of the $S_i$ be capable of transmission on the system; certain sequences only may be allowed. These will be possible signals for the channel. Thus in telegraphy suppose the symbols are: (1) A dot, consisting of line closure for a unit of time and then line open for a unit of time; (2) A dash, consisting of three time units of closure and one unit open; (3) A letter space consisting of, say, three units of line open; (4) A word space of six units of line open. We might place the restriction on allowable sequences that no spaces follow each other (for if two letter spaces are adjacent, it is identical with a word space). The question we now consider is how one can measure the capacity of such a channel to transmit information.
在电传打字机的情况下,所有符号都具有相同的持续时间,并且允许 32 个符号的任何序列,答案很容易。每个符号代表五位信息。如果系统每秒传输n 个符号,则自然可以说该信道的容量为每秒5 n位。这并不意味着电传打字机通道将始终以此速率传输信息 - 这是最大可能的速率,实际速率是否达到此最大值取决于馈送通道的信息源,同样,稍后出现。在更一般的情况下,具有不同长度的符号和对允许序列的约束,我们做出以下定义:
In the teletype case where all symbols are of the same duration, and any sequence of the 32 symbols is allowed the answer is easy. Each symbol represents five bits of information. If the system transmits n symbols per second it is natural to say that the channel has a capacity of 5n bits per second. This does not mean that the teletype channel will always be transmitting information at this rate—this is the maximum possible rate and whether or not the actual rate reaches this maximum depends on the source of information which feeds the channel, as will appear later. In the more general case with different lengths of symbols and constraints on the allowed sequences, we make the following definition:
定义:离散通道的容量 C 由下式给出
Definition: The capacity C of a discrete channel is given by

$$C = \lim_{T \to \infty} \frac{\log N(T)}{T}$$
其中 N ( T )是持续时间为 T 的允许信号数量。
where N(T) is the number of allowed signals of duration T.
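For the teletype case the definition can be evaluated directly. The sketch below assumes the definition is read as the limit of log N(T) / T with logarithms to base 2, and uses a hypothetical signalling rate of 10 symbols per second; the function names are illustrative only.

```python
import math


def allowed_signals(duration_s: float, symbols_per_second: float,
                    alphabet_size: int = 32) -> int:
    """N(T) when every sequence of equal-duration symbols is allowed."""
    return alphabet_size ** round(duration_s * symbols_per_second)


def capacity_estimate(duration_s: float, symbols_per_second: float,
                      alphabet_size: int = 32) -> float:
    """log2 N(T) / T in bits per second for a finite duration T."""
    n_t = allowed_signals(duration_s, symbols_per_second, alphabet_size)
    return math.log2(n_t) / duration_s


# With 32 symbols and 10 symbols per second the ratio is 5 * 10 for every T,
# so the limit is reached immediately.
for duration in (1, 10, 100):
    print(duration, capacity_estimate(duration, symbols_per_second=10))
```

In this unconstrained, equal-duration case the ratio does not depend on T at all; the limit only matters once symbols differ in length or some sequences are forbidden.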
很容易看出,在电传打字机的情况下,这会简化为先前的结果。可以证明,在大多数感兴趣的情况下,所讨论的极限将作为有限数存在。假设允许符号 $S_1, \ldots, S_n$ 的所有序列,并且这些符号的持续时间为 $t_1, \ldots, t_n$。信道容量是多少?如果 $N(t)$ 表示持续时间为 t 的序列数,我们有
It is easily seen that in the teletype case this reduces to the previous result. It can be shown that the limit in question will exist as a finite number in most cases of interest. Suppose all sequences of the symbols $S_1, \ldots, S_n$ are allowed and these symbols have durations $t_1, \ldots, t_n$. What is the channel capacity? If $N(t)$ represents the number of sequences of duration t we have

$$N(t) = N(t - t_1) + N(t - t_2) + \cdots + N(t - t_n).$$
总数等于以 $S_1, S_2, \ldots, S_n$ 结尾的序列数之和,而这些数分别为 $N(t - t_1), N(t - t_2), \ldots, N(t - t_n)$。根据有限差分中一个众所周知的结果,对于大的 t,$N(t)$ 渐近于 $X_0^t$,其中 $X_0$ 是特征方程 $X^{-t_1} + X^{-t_2} + \cdots + X^{-t_n} = 1$ 的最大实数解,因此 $C = \log X_0$。
The total number is equal to the sum of the numbers of sequences ending in $S_1, S_2, \ldots, S_n$ and these are $N(t - t_1), N(t - t_2), \ldots, N(t - t_n)$, respectively. According to a well-known result in finite differences, $N(t)$ is then asymptotic for large t to $X_0^t$ where $X_0$ is the largest real solution of the characteristic equation $X^{-t_1} + X^{-t_2} + \cdots + X^{-t_n} = 1$, and therefore $C = \log X_0$.
如果允许的序列有限制,我们仍然可以经常获得这种类型的差分方程,并从特征方程中找到C。在上述电报案中
In case there are restrictions on allowed sequences we may still often obtain a difference equation of this type and find C from the characteristic equation. In the telegraphy case mentioned above

$$N(t) = N(t - 2) + N(t - 4) + N(t - 5) + N(t - 7) + N(t - 8) + N(t - 10)$$
这可以通过按最后一个或倒数第二个出现的符号对符号序列计数看出。因此,C 为 $-\log \mu_0$,其中 $\mu_0$ 是 $1 = \mu^2 + \mu^4 + \mu^5 + \mu^7 + \mu^8 + \mu^{10}$ 的正根。解此方程可得 C = 0.539。
as we see by counting sequences of symbols according to the last or next to the last symbol occurring. Hence C is $-\log \mu_0$ where $\mu_0$ is the positive root of $1 = \mu^2 + \mu^4 + \mu^5 + \mu^7 + \mu^8 + \mu^{10}$. Solving this we find C = 0.539.
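The value C = 0.539 can be reproduced numerically. The sketch below solves the characteristic equation with exponents 2, 4, 5, 7, 8, 10 by bisection and, as a cross-check, iterates the matching difference equation. The initial condition N(0) = 1 (and N(t) = 0 for negative t) is an assumption made here to start the recurrence; it affects only a constant factor, not the growth rate. Logarithms are taken to base 2.

```python
import math

# Exponents of the characteristic equation for the telegraph case.
DURATIONS = (2, 4, 5, 7, 8, 10)


def characteristic(mu: float) -> float:
    """Left side minus right side: sum of mu^d over the durations, minus 1."""
    return sum(mu ** d for d in DURATIONS) - 1.0


def positive_root(lo: float = 0.0, hi: float = 1.0, iterations: int = 100) -> float:
    """Bisection; the function is increasing on [0, 1], -1 at 0 and +5 at 1."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if characteristic(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def count_sequences(t_max: int) -> list[int]:
    """Iterate N(t) = sum of N(t - d), assuming N(0) = 1 and N(t) = 0 for t < 0."""
    counts = [0] * (t_max + 1)
    counts[0] = 1
    for t in range(1, t_max + 1):
        counts[t] = sum(counts[t - d] for d in DURATIONS if t - d >= 0)
    return counts


mu0 = positive_root()
capacity = -math.log2(mu0)                 # bits per unit time
counts = count_sequences(400)
empirical = math.log2(counts[400]) / 400   # finite-T estimate of log N(T) / T

print(round(capacity, 3), round(empirical, 3))
```

The finite-T estimate approaches the root-based value only slowly, because the constant factor in front of the exponential contributes a term that decays like 1/T.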
可以对允许的序列施加的一种非常通用的限制类型如下:我们设想若干可能的状态 $a_1, a_2, \ldots, a_m$。对于每个状态,只能传输集合 $S_1, \ldots, S_n$ 中的某些符号(不同状态对应不同子集)。当其中之一被传输后,状态变为新状态,新状态取决于旧状态和所传输的特定符号。电报的情况就是一个简单的例子。根据间隔是否是最后传输的符号,有两种状态。如果是,那么接下来只能发送点或划,并且状态总是会改变。如果不是,则可以传输任何符号,并且如果发送间隔则状态改变,否则保持不变。这些条件可以用线性图表示,如图 12.2 所示。连接点对应于状态,线条表示某一状态下可能的符号以及由此产生的状态。附录 1 [编辑:省略] 中表明,如果对允许序列的条件可以用这种形式描述,则 C 将存在,并且可以根据以下结果进行计算:
A very general type of restriction which may be placed on allowed sequences is the following: We imagine a number of possible states $a_1, a_2, \ldots, a_m$. For each state only certain symbols from the set $S_1, \ldots, S_n$ can be transmitted (different subsets for the different states). When one of these has been transmitted the state changes to a new state depending both on the old state and the particular symbol transmitted. The telegraph case is a simple example of this. There are two states depending on whether or not a space was the last symbol transmitted. If so, then only a dot or a dash can be sent next and the state always changes. If not, any symbol can be transmitted and the state changes if a space is sent, otherwise it remains the same. The conditions can be indicated in a linear graph as shown in Figure 12.2. The junction points correspond to the states and the lines indicate the symbols possible in a state and the resulting state. In Appendix 1 [EDITOR: omitted] it is shown that if the conditions on allowed sequences can be described in this form C will exist and can be calculated in accordance with the following result:
图 12.2: 电报符号约束的图形表示。
Figure 12.2: Graphical representation of the constraints on telegraph symbols.
定理 1:令 $b_{ij}^{(s)}$ 为在状态 i 中允许并导致状态 j 的第 s 个符号的持续时间。那么信道容量 C 等于 log W,其中 W 是下列行列式方程的最大实根:
Theorem 1: Let $b_{ij}^{(s)}$ be the duration of the sth symbol which is allowable in state i and leads to state j. Then the channel capacity C is equal to log W where W is the largest real root of the determinant equation:

$$\left| \sum_s W^{-b_{ij}^{(s)}} - \delta_{ij} \right| = 0,$$
其中,如果 i = j,则 $\delta_{ij} = 1$,否则为零。
where $\delta_{ij} = 1$ if i = j and is zero otherwise.
例如,在电报情况下(图 12.2),行列式是:
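For example, in the telegraph case (Figure 12.2) the determinant is:

$$\begin{vmatrix} -1 & W^{-2} + W^{-4} \\ W^{-3} + W^{-6} & W^{-2} + W^{-4} - 1 \end{vmatrix} = 0.$$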
在展开时,这导致了上面针对这种情况给出的方程。
On expansion this leads to the equation given above for this case.
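The determinant form of the calculation can be sketched for the two-state telegraph description. The state numbering below is an assumption made for illustration: state 0 means the last symbol was a space, state 1 means it was not. Durations follow the text (dot 2, dash 4, letter space 3, word space 6), and logarithms are to base 2.

```python
import math

# TRANSITIONS[i][j]: durations of symbols allowed in state i that lead to state j.
# State 0: last symbol was a space (only dot or dash may follow).
# State 1: last symbol was a dot or dash (any symbol may follow).
TRANSITIONS = {
    0: {1: [2, 4]},
    1: {0: [3, 6], 1: [2, 4]},
}


def determinant(w: float) -> float:
    """2x2 determinant with entries sum_s w^(-b_ij^(s)) - delta_ij."""
    def entry(i: int, j: int) -> float:
        total = sum(w ** -b for b in TRANSITIONS[i].get(j, []))
        return total - (1.0 if i == j else 0.0)
    return entry(0, 0) * entry(1, 1) - entry(0, 1) * entry(1, 0)


def largest_root(lo: float = 1.0, hi: float = 2.0, iterations: int = 100) -> float:
    """Bisection; the determinant is negative at W = 1 and positive at W = 2."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if determinant(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


w0 = largest_root()
capacity = math.log2(w0)
print(round(capacity, 3))
```

Expanding this determinant gives one minus the sum of W to the powers -2, -4, -5, -7, -8, -10, so its root is the reciprocal of the positive root of the characteristic equation quoted earlier, and the capacity agrees with 0.539.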
我们已经看到,在非常一般的条件下,离散通道中可能信号数量的对数随时间线性增加。传输信息的容量可以通过给出该增长率来指定,即指定所使用的特定信号所需的每秒位数。我们现在考虑信息源。如何用数学方法描述信息源,以及给定源每秒产生多少信息(以比特为单位)?争论的要点是,通过使用正确的信息编码,关于源的统计知识对减少信道所需容量的影响。例如,在电报中,要传输的消息由字母序列组成。然而,这些序列并不是完全随机的。一般来说,它们形成句子并具有英语等的统计结构。字母E比Q更频繁地出现,序列TH比XP更频繁地出现,等等。这种结构的存在允许人们通过将消息序列正确地编码成信号序列来节省时间(或信道容量)。在电报中,通过使用最短的通道符号(点)来表示最常见的英文字母 E,已经在一定程度上做到了这一点;而不常见的字母 Q、X、Z 则由较长的点和破折号序列表示。这种想法在某些商业代码中得到了进一步的体现,其中常见的单词和短语由四个或五个字母的代码组表示,从而平均时间节省了相当多。现在使用的标准化问候语和周年纪念电报将其扩展为将一两句话编码为相对较短的数字序列。
We have seen that under very general conditions the logarithm of the number of possible signals in a discrete channel increases linearly with time. The capacity to transmit information can be specified by giving this rate of increase, the number of bits per second required to specify the particular signal used. We now consider the information source. How is an information source to be described mathematically, and how much information in bits per second is produced in a given source? The main point at issue is the effect of statistical knowledge about the source in reducing the required capacity of the channel, by the use of proper encoding of the information. In telegraphy, for example, the messages to be transmitted consist of sequences of letters. These sequences, however, are not completely random. In general, they form sentences and have the statistical structure of, say, English. The letter E occurs more frequently than Q, the sequence TH more frequently than XP, etc. The existence of this structure allows one to make a saving in time (or channel capacity) by properly encoding the message sequences into signal sequences. This is already done to a limited extent in telegraphy by using the shortest channel symbol, a dot, for the most common English letter E; while the infrequent letters, Q, X, Z are represented by longer sequences of dots and dashes. This idea is carried still further in certain commercial codes where common words and phrases are represented by four- or five-letter code groups with a considerable saving in average time. The standardized greeting and anniversary telegrams now in use extend this to the point of encoding a sentence or two into a relatively short sequence of numbers.
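The saving described here, short signals for frequent letters, can be illustrated with a toy source. Everything in this sketch is hypothetical: the four-letter alphabet, its probabilities, and both codes are chosen only to show the effect and do not come from the text.

```python
# Hypothetical four-letter source; probabilities are illustrative only.
PROBABILITIES = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}

# Two codes for the same source: fixed length, and shorter words for common letters.
FIXED_CODE = {"A": "00", "B": "01", "C": "10", "D": "11"}
VARIABLE_CODE = {"A": "0", "B": "10", "C": "110", "D": "111"}


def average_length(code: dict[str, str]) -> float:
    """Expected number of binary digits per source letter."""
    return sum(PROBABILITIES[letter] * len(word) for letter, word in code.items())


def encode(message: str, code: dict[str, str]) -> str:
    return "".join(code[letter] for letter in message)


def decode(bits: str, code: dict[str, str]) -> str:
    """Decode a prefix-free code by reading digits until a code word is complete."""
    inverse = {word: letter for letter, word in code.items()}
    letters, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            letters.append(inverse[buffer])
            buffer = ""
    return "".join(letters)


print(average_length(FIXED_CODE), average_length(VARIABLE_CODE))
```

For this particular source the variable-length code needs fewer digits per letter on average than the fixed-length code, while `decode` still recovers the message because no code word is a prefix of another.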
我们可以将离散源视为一个符号一个符号地生成消息。它将根据某些概率选择连续的符号,通常取决于先前的选择以及所讨论的特定符号。产生由一组概率控制的符号序列的物理系统或系统的数学模型被称为随机过程(例如,参见 Chandrasekhar [1943])。因此,我们可以考虑用随机过程来表示离散源。相反,任何随机过程产生从有限集合中选择的离散符号序列可以被认为是离散源。这将包括以下情况:
We can think of a discrete source as generating the message, symbol by symbol. It will choose successive symbols according to certain probabilities depending, in general, on preceding choices as well as the particular symbols in question. A physical system, or a mathematical model of a system which produces such a sequence of symbols governed by a set of probabilities, is known as a stochastic process (see, for example, Chandrasekhar [1943]). We may consider a discrete source, therefore, to be represented by a stochastic process. Conversely, any stochastic process which produces a discrete sequence of symbols chosen from a finite set may be considered a discrete source. This will include such cases as:
1.自然书面语言,如英语、德语、汉语。
1. Natural written languages such as English, German, Chinese.
2. 通过某种量化过程而变得离散的连续信息源。例如,来自 PCM 发射机的量化语音或量化电视信号。
2. Continuous information sources that have been rendered discrete by some quantizing process. For example, the quantized speech from a PCM transmitter, or a quantized television signal.
3. 数学案例,我们仅仅抽象地定义一个生成符号序列的随机过程。以下是最后一种来源的示例。
3. Mathematical cases where we merely define abstractly a stochastic process which generates a sequence of symbols. The following are examples of this last type of source.
(A) 假设我们有五个字母 A、B、C、D、E,每个字母被选中的概率均为 0.2,连续的选择相互独立。这将产生一个序列,下面是一个典型的例子。
(A) Suppose we have five letters A, B, C, D, E which are chosen each with probability 0.2, successive choices being independent. This would lead to a sequence of which the following is a typical example.
B D C B C E C C C A D C B D D A A E C E E A
A B B D A E E C A C E E B A E E C B C E A D.
这是使用随机数表构建的(Kendall,1939)。
This was constructed with the use of a table of random numbers (Kendall, 1939).
(B) 使用相同的五个字母,令概率分别为 0.4、0.1、0.2、0.2、0.1,连续的选择相互独立。来自该来源的典型消息是:
(B) Using the same five letters let the probabilities be 0.4, 0.1, 0.2, 0.2, 0.1, respectively, with successive choices independent. A typical message from this source is then:
A A A C D C B D C E A A D A D A C E D A
E A D C A B E D A D D C E C A A A A A D.
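Sources (A) and (B) are easy to simulate. The sketch below uses a pseudo-random generator in place of the printed random-number table mentioned in the text; the function name `sample_source` and the seed are assumptions for illustration.

```python
import random
from collections import Counter

LETTERS = "ABCDE"
UNIFORM = [0.2, 0.2, 0.2, 0.2, 0.2]   # source (A)
SKEWED = [0.4, 0.1, 0.2, 0.2, 0.1]    # source (B)


def sample_source(weights: list[float], length: int, seed: int = 0) -> str:
    """Draw `length` letters independently with the given probabilities."""
    rng = random.Random(seed)
    return "".join(rng.choices(LETTERS, weights=weights, k=length))


def relative_frequencies(message: str) -> dict[str, float]:
    counts = Counter(message)
    return {letter: counts[letter] / len(message) for letter in LETTERS}


print(" ".join(sample_source(UNIFORM, 20)))
print(" ".join(sample_source(SKEWED, 20)))
print(relative_frequencies(sample_source(SKEWED, 20000)))
```

Short samples look irregular, as the printed examples do; only over long messages do the relative frequencies settle near the stated probabilities.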
(C) 如果连续的符号不是独立选择的,而是它们的概率取决于前面的字母,则会获得更复杂的结构。在这种类型的最简单的情况下,选择仅取决于前一个字母,而不取决于更早的字母。然后可以通过一组转移概率 $p_i(j)$ 来描述统计结构,即字母 i 后面跟着字母 j 的概率。下标 i 和 j 遍历所有可能的符号。指定该结构的第二种等效方法是给出“二元组”(digram)概率 $p(i, j)$,即二元组 ij 的相对频率。字母频率 $p(i)$(字母 i 的概率)、转移概率 $p_i(j)$ 和二元组概率 $p(i, j)$ 由以下公式关联:
(C) A more complicated structure is obtained if successive symbols are not chosen independently but their probabilities depend on preceding letters. In the simplest case of this type a choice depends only on the preceding letter and not on ones before that. The statistical structure can then be described by a set of transition probabilities $p_i(j)$, the probability that letter i is followed by letter j. The indices i and j range over all the possible symbols. A second equivalent way of specifying the structure is to give the “digram” probabilities $p(i, j)$, i.e., the relative frequency of the digram ij. The letter frequencies $p(i)$ (the probability of letter i), the transition probabilities $p_i(j)$ and the digram probabilities $p(i, j)$ are related by the following formulas:

$$p(i) = \sum_j p(i, j) = \sum_j p(j, i) = \sum_j p(j)\, p_j(i)$$

$$p(i, j) = p(i)\, p_i(j)$$

$$\sum_j p_i(j) = \sum_i p(i) = \sum_{i, j} p(i, j) = 1.$$
作为一个具体的例子,假设有三个字母 A、B、C 和概率表:
As a specific example suppose there are three letters A, B, C with the probability tables:
A typical message from this source is the following:
A B B A B A B A B A B A B A B B B A B B B B B A B A B A B A B A B B B A C A C A B B A B B B B A B B A B A C B B B A B A.
The next increase in complexity would involve trigram frequencies but no more. The choice of a letter would depend on the preceding two letters but not on the message before that point. A set of trigram frequencies $p(i, j, k)$ or equivalently a set of transition probabilities $p_{ij}(k)$ would be required. Continuing in this way one obtains successively more complicated stochastic processes. In the general $n$-gram case a set of $n$-gram probabilities $p(i_1, i_2, \ldots, i_n)$ or of transition probabilities $p_{i_1, i_2, \ldots, i_{n-1}}(i_n)$ is required to specify the statistical structure.
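A source of type (C) can be sketched as a first-order Markov chain. The probability tables for the three-letter example are not reproduced in this text, so the values below are an assumption (they are intended to match the tables of the original paper, and they do satisfy the relations among $p(i)$, $p_i(j)$ and $p(i, j)$ stated above). The names `transition`, `letter_prob` and `generate` are illustrative.

```python
import random
from fractions import Fraction as F

# ASSUMED transition probabilities p_i(j); the tables are missing from this text.
transition = {
    "A": {"A": F(0), "B": F(4, 5), "C": F(1, 5)},
    "B": {"A": F(1, 2), "B": F(1, 2), "C": F(0)},
    "C": {"A": F(1, 2), "B": F(2, 5), "C": F(1, 10)},
}
# ASSUMED letter frequencies p(i), stationary for the chain above.
letter_prob = {"A": F(9, 27), "B": F(16, 27), "C": F(2, 27)}

# Digram probabilities follow from p(i, j) = p(i) p_i(j).
digram = {(i, j): letter_prob[i] * transition[i][j]
          for i in transition for j in transition[i]}

def generate(length, seed=0):
    """Emit `length` letters, each depending only on the preceding letter."""
    rng = random.Random(seed)
    symbols = list(letter_prob)
    current = rng.choices(symbols, weights=[float(letter_prob[s]) for s in symbols])[0]
    out = [current]
    for _ in range(length - 1):
        row = transition[current]
        current = rng.choices(symbols, weights=[float(row[s]) for s in symbols])[0]
        out.append(current)
    return out

print(" ".join(generate(40)))
```

Under these assumed tables the digrams AA and BC have probability zero, so they never appear in the output, much as they are absent from the typical message quoted above.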
(D) Stochastic processes can also be defined which produce a text consisting of a sequence of n “words.” Suppose there are five letters A, B, C, D, E and 16 “words” in the language with associated probabilities:
.10 A       .16 BEBE    .11 CABED   .04 DEB
.04 ADEB    .04 BED     .05 CEED    .15 DEED
.05 ADEE    .02 BEED    .08 DAB     .01 EAB
.01 BADD    .05 CA      .04 DAD     .05 EE
Suppose successive “words” are chosen independently and are separated by a space. A typical message might be:
DAB EE A BEBE DEED DEB ADEE ADEE EE DEB BEBE BEBE BEBE ADEE BED DEED DEED CEED ADEE A DEED DEED BEBE CABED BEBE BED DAB DEED ADEB.
If all the words are of finite length this process is equivalent to one of the preceding type, but the description may be simpler in terms of the word structure and probabilities. We may also generalize here and introduce transition probabilities between words, etc.
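The word source of example (D) can be sketched in the same way, drawing whole "words" independently. The probabilities are those listed for the 16 words, written in hundredths so that their sum can be checked exactly; `word_weights` and `word_message` are illustrative names.

```python
import random

# Word probabilities from example (D), in hundredths.
word_weights = {
    "A": 10, "BEBE": 16, "CABED": 11, "DEB": 4,
    "ADEB": 4, "BED": 4, "CEED": 5, "DEED": 15,
    "ADEE": 5, "BEED": 2, "DAB": 8, "EAB": 1,
    "BADD": 1, "CA": 5, "DAD": 4, "EE": 5,
}

def word_message(count, seed=0):
    """Choose `count` words independently and separate them by spaces."""
    rng = random.Random(seed)
    words = list(word_weights)
    chosen = rng.choices(words, weights=[word_weights[w] for w in words], k=count)
    return " ".join(chosen)

print(word_message(30))
```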
These artificial languages are useful in constructing simple problems and examples to illustrate various possibilities. We can also approximate to a natural language by means of a series of simple artificial languages. The zero-order approximation is obtained by choosing all letters with the same probability and independently. The first-order approximation is obtained by choosing successive letters independently but each letter having the same probability that it has in the natural language. (Letter, digram, and trigram frequencies are given in Pratt [1939]. Word frequencies are tabulated in Dewey [1923].) Thus, in the first-order approximation to English, E is chosen with probability .12 (its frequency in normal English) and W with probability .02, but there is no influence between adjacent letters and no tendency to form the preferred digrams such as TH, ED, etc. In the second-order approximation, digram structure is introduced. After a letter is chosen, the next one is chosen in accordance with the frequencies with which the various letters follow the first one. This requires a table of digram frequencies $p_i(j)$. In the third-order approximation, trigram structure is introduced. Each letter is chosen with probabilities which depend on the preceding two letters.
To give a visual idea of how this series of processes approaches a language, typical sequences in the approximations to English have been constructed and are given below. In all cases we have assumed a 27-symbol “alphabet,” the 26 letters and a space.
1. Zero-order approximation (symbols independent and equiprobable).
XFOML RXKHRJFFJUJ ZLPWCFWKCYJ FFJEYVKCQSGHYD QPAAMKBZAACIBZLHJQD.
2. First-order approximation (symbols independent but with frequencies of English text).
OCRO HLI RGWR NMIELWIS EU LL NBNESEBYA TH EEI ALHENHTTPA OOBTTVA NAH BRL.
3. Second-order approximation (digram structure as in English).
ON IE ANTSOUTINYS ARE T INCTORE ST BE S DEAMY ACHIN D ILONASIVE TUCOOWE AT TEASONARE FUSO TIZIN ANDY TOBE SEACE CTISBE.
4. Third-order approximation (trigram structure as in English).
IN NO IST LAT WHEY CRATICT FROURE BIRS GROCID PONDENOME OF DEMONSTURES OF THE REPTAGIN IS REGOACTIONA OF CRE.
5. First-order word approximation. Rather than continue with tetragram, …, n-gram structure it is easier and better to jump at this point to word units. Here words are chosen independently but with their appropriate frequencies (Dewey, 1923).
REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE MESSAGE HAD BE THESE.
6. Second-order word approximation. The word transition probabilities are correct but no further structure is included.
THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.
The resemblance to ordinary English text increases quite noticeably at each of the above steps. Note that these samples have reasonably good structure out to about twice the range that is taken into account in their construction. Thus in (3) the statistical process insures reasonable text for two-letter sequences, but four-letter sequences from the sample can usually be fitted into good sentences. In (6) sequences of four or more words can easily be placed in sentences without unusual or strained constructions. The particular sequence of ten words “attack on an English writer that the character of this” is not at all unreasonable. It appears then that a sufficiently complex stochastic process will give a satisfactory representation of a discrete source.
The first two samples were constructed by the use of a book of random numbers in conjunction with (for example 2) a table of letter frequencies. This method might have been continued for (3), (4) and (5), since digram, trigram and word frequency tables are available, but a simpler equivalent method was used. To construct (3) for example, one opens a book at random and selects a letter at random on the page. This letter is recorded. The book is then opened to another page and one reads until this letter is encountered. The succeeding letter is then recorded. Turning to another page this second letter is searched for and the succeeding letter recorded, etc. A similar process was used for (4), (5) and (6). It would be interesting if further approximations could be constructed, but the labor involved becomes enormous at the next stage.
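The book-and-page procedure amounts to sampling the next letter from the letters that follow the current context somewhere in a body of text. A minimal sketch of that idea follows; it tabulates followers from a stand-in corpus rather than leafing through a book, and the corpus string, the function names and the fallback rule are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Stand-in corpus (an assumption): a real experiment would use a long English text.
CORPUS = ("THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE "
          "CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS")

def build_followers(text, order):
    """Map each (order - 1)-letter context to every letter that follows it in text."""
    ctx_len = order - 1
    followers = defaultdict(list)
    for k in range(len(text) - ctx_len):
        followers[text[k:k + ctx_len]].append(text[k + ctx_len])
    return followers

def approximate(text, order, length, seed=0):
    """Generate `length` symbols; order=1 uses letter frequencies, order=2 digrams, etc."""
    rng = random.Random(seed)
    ctx_len = order - 1
    followers = build_followers(text, order)
    start = rng.randrange(len(text) - ctx_len)
    out = text[start:start + ctx_len]
    while len(out) < length:
        context = out[len(out) - ctx_len:]
        # Fall back to any letter of the text if this context has no recorded follower.
        choices = followers.get(context) or list(text)
        out += rng.choice(choices)
    return out

for order in (1, 2, 3):
    print(order, approximate(CORPUS, order, 60))
```

Because each follower is listed once per occurrence, choosing uniformly from the list reproduces the conditional frequencies of the corpus, which is what opening to a random page and reading until the context appears approximates by hand.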
Stochastic processes of the type described above are known mathematically as discrete Markoff processes and have been extensively studied in the literature (for a detailed treatment, see Frechet [1938]). The general case can be described as follows: There exist a finite number of possible “states” of a system; $S_1, S_2, \ldots, S_n$. In addition there is a set of transition probabilities, $p_i(j)$, the probability that if the system is in state $S_i$ it will next go to state $S_j$. To make this Markoff process into an information source we need only assume that a letter is produced for each transition from one state to another. The states will correspond to the “residue of influence” from preceding letters.
The situation can be represented graphically as shown in Figures 12.3, 12.4 and 12.5. The “states” are the junction points in the graph and the probabilities and letters produced for a transition are given beside the corresponding line. Figure 12.3 is for the example B in §12.2, while Figure 12.4 corresponds to the example C. In Figure 12.3, there is only one state since successive letters are independent. In Figure 12.4 there are as many states as letters. If a trigram example were constructed there would be at most $n^2$ states corresponding to the possible pairs of letters preceding the one being chosen. Figure 12.5 is a graph for the case of word structure in example D. Here S corresponds to the “space” symbol. …
Figure 12.3: A graph corresponding to the source in example B.
Figure 12.4: A graph corresponding to the source in example C.
Figure 12.5: A graph corresponding to the source in example D.
We have represented a discrete information source as a Markoff process. Can we define a quantity which will measure, in some sense, how much information is “produced” by such a process, or better, at what rate information is produced?
Suppose we have a set of possible events whose probabilities of occurrence are $p_1, p_2, \ldots, p_n$. These probabilities are known but that is all we know concerning which event will occur. Can we find a measure of how much “choice” is involved in the selection of the event or of how uncertain we are of the outcome?
If there is such a measure, say $H(p_1, p_2, \ldots, p_n)$, it is reasonable to require of it the following properties:
1. $H$ should be continuous in the $p_i$.
2. If all the $p_i$ are equal, $p_i = 1/n$, then $H$ should be a monotonic increasing function of $n$. With equally likely events there is more choice, or uncertainty, when there are more possible events.
3. If a choice be broken down into two successive choices, the original $H$ should be the weighted sum of the individual values of $H$. The meaning of this is illustrated in Figure 12.6. At the left we have three possibilities $p_1 = \tfrac{1}{2}$, $p_2 = \tfrac{1}{3}$, $p_3 = \tfrac{1}{6}$. On the right we first choose between two possibilities each with probability $\tfrac{1}{2}$, and if the second occurs make another choice with probabilities $\tfrac{2}{3}$, $\tfrac{1}{3}$. The final results have the same probabilities as before. We require, in this special case, that
$$H\left(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6}\right) = H\left(\tfrac{1}{2}, \tfrac{1}{2}\right) + \tfrac{1}{2}\, H\left(\tfrac{2}{3}, \tfrac{1}{3}\right).$$
Figure 12.6: Decomposition of a choice from three possibilities.
The coefficient $\tfrac{1}{2}$ is because this second choice only occurs half the time.
In Appendix 2 [EDITOR: omitted], the following result is established:
Theorem 2: The only $H$ satisfying the three above assumptions is of the form:
$$H = -K \sum_{i=1}^{n} p_i \log p_i$$
where $K$ is a positive constant.
This theorem, and the assumptions required for its proof, are in no way necessary for the present theory. It is given chiefly to lend a certain plausibility to some of our later definitions. The real justification of these definitions, however, will reside in their implications.
Quantities of the form $H = -\sum p_i \log p_i$ (the constant $K$ merely amounts to a choice of a unit of measure) play a central role in information theory as measures of information, choice and uncertainty. The form of $H$ will be recognized as that of entropy as defined in certain formulations of statistical mechanics where $p_i$ is the probability of a system being in cell $i$ of its phase space. $H$ is then, for example, the $H$ in Boltzmann’s famous $H$ theorem. We shall call $H = -\sum p_i \log p_i$ the entropy of the set of probabilities $p_1, \ldots, p_n$. If $x$ is a chance variable we will write $H(x)$ for its entropy; thus $x$ is not an argument of a function but a label for a number, to differentiate it from $H(y)$ say, the entropy of the chance variable $y$.
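The entropy formula is short enough to check numerically. The sketch below takes base-2 logarithms (so the unit is the bit, a choice of the constant $K$) and verifies two of the required properties on the probabilities $\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6}$ used in the decomposition of Figure 12.6; the function name `entropy` is an illustrative assumption.

```python
from math import log2, isclose

def entropy(probs):
    """H = -sum p log2 p, in bits; terms with p = 0 contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Property 3: a choice among (1/2, 1/3, 1/6) equals a fair binary choice
# followed, half the time, by a choice with probabilities (2/3, 1/3).
direct = entropy([1/2, 1/3, 1/6])
staged = entropy([1/2, 1/2]) + 0.5 * entropy([2/3, 1/3])
print(direct, staged)

# Property 2: with n equally likely events, H grows with n.
uniform = [entropy([1 / n] * n) for n in range(1, 9)]
print(uniform)
```

For equally likely events the function returns $\log_2 n$, so four equiprobable outcomes carry two bits.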
The entropy in the case of two possibilities with probabilities $p$ and $q = 1 - p$, namely
$$H = -(p \log p + q \log q)$$
is plotted in Figure 12.7 as a function of $p$.
Figure 12.7: Entropy in the case of two possibilities with probabilities p and (1 − p).
The quantity H has a number of interesting properties which further substantiate it as a reasonable measure of choice or information.
1. $H = 0$ if and only if all the $p_i$ but one are zero, this one having the value unity. Thus only when we are certain of the outcome does $H$ vanish. Otherwise $H$ is positive.
2. For a given $n$, $H$ is a maximum and equal to $\log n$ when all the $p_i$ are equal (i.e., $1/n$). This is also intuitively the most uncertain situation.
3. Suppose there are two events, $x$ and $y$, in question with $m$ possibilities for the first and $n$ for the second. Let $p(i, j)$ be the probability of the joint occurrence of $i$ for the first and $j$ for the second. The entropy of the joint event is
$$H(x, y) = -\sum_{i,j} p(i, j) \log p(i, j)$$
while
$$H(x) = -\sum_{i,j} p(i, j) \log \sum_j p(i, j)$$
$$H(y) = -\sum_{i,j} p(i, j) \log \sum_i p(i, j).$$
It is easily shown that H(x, y) ≤ H(x) + H(y), with equality only if the events are independent (i.e., p(i, j) = p(i)p(j)). The uncertainty of a joint event is less than or equal to the sum of the individual uncertainties.
4. Any change toward equalization of the probabilities $p_1, p_2, \ldots, p_n$ increases $H$. Thus if $p_1 < p_2$ and we increase $p_1$, decreasing $p_2$ an equal amount so that $p_1$ and $p_2$ are more nearly equal, then $H$ increases. More generally, if we perform any “averaging” operation on the $p_i$ of the form
$$p_i' = \sum_j a_{ij}\, p_j$$
where $\sum_i a_{ij} = \sum_j a_{ij} = 1$, and all $a_{ij} \ge 0$, then $H$ increases (except in the special case where this transformation amounts to no more than a permutation of the $p_j$ with $H$ of course remaining the same).
5. Suppose there are two chance events $x$ and $y$ as in 3, not necessarily independent. For any particular value $i$ that $x$ can assume there is a conditional probability $p_i(j)$ that $y$ has the value $j$. This is given by
$$p_i(j) = \frac{p(i, j)}{\sum_j p(i, j)}.$$
We define the conditional entropy of $y$, $H_x(y)$, as the average of the entropy of $y$ for each value of $x$, weighted according to the probability of getting that particular $x$. That is
$$H_x(y) = -\sum_{i,j} p(i, j) \log p_i(j).$$
This quantity measures how uncertain we are of $y$ on the average when we know $x$. Substituting the value of $p_i(j)$ we obtain
$$H_x(y) = -\sum_{i,j} p(i, j) \log p(i, j) + \sum_{i,j} p(i, j) \log \sum_j p(i, j) = H(x, y) - H(x)$$
or $H(x, y) = H(x) + H_x(y)$. The uncertainty (or entropy) of the joint event $x, y$ is the uncertainty of $x$ plus the uncertainty of $y$ when $x$ is known.
6. From 3 and 5 we have $H(x) + H(y) \ge H(x, y) = H(x) + H_x(y)$. Hence $H(y) \ge H_x(y)$. The uncertainty of $y$ is never increased by knowledge of $x$. It will be decreased unless $x$ and $y$ are independent events, in which case it is not changed.
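Properties 3, 5 and 6 can be checked on a small joint distribution. The table `joint` below is a hypothetical example chosen only so that the two events are dependent; it does not come from the paper, and the variable names are likewise illustrative.

```python
from math import log2, isclose

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Hypothetical joint distribution p(i, j): rows index x, columns index y.
joint = [[0.30, 0.10],
         [0.05, 0.55]]

px = [sum(row) for row in joint]           # marginal distribution of x
py = [sum(col) for col in zip(*joint)]     # marginal distribution of y
H_xy = H([p for row in joint for p in row])
H_x, H_y = H(px), H(py)

# Conditional entropy: entropy of y for each x, weighted by the probability of that x.
H_y_given_x = sum(px[i] * H([p / px[i] for p in joint[i]])
                  for i in range(len(joint)))

print(H_xy, H_x + H_y_given_x)   # property 5: these agree
print(H_x + H_y, H_y)            # property 3 bounds H_xy; property 6 bounds H_y_given_x
```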
Consider a discrete source of the finite state type considered above. For each possible state $i$ there will be a set of probabilities $p_i(j)$ of producing the various possible symbols $j$. Thus there is an entropy $H_i$ for each state. The entropy of the source will be defined as the average of these $H_i$ weighted in accordance with the probability of occurrence of the states in question:
$$H = \sum_i P_i H_i = -\sum_{i,j} P_i\, p_i(j) \log p_i(j).$$
This is the entropy of the source per symbol of text. If the Markoff process is proceeding at a definite time rate there is also an entropy per second
$$H' = \sum_i f_i H_i$$
where $f_i$ is the average frequency (occurrences per second) of state $i$. Clearly $H' = mH$ where $m$ is the average number of symbols produced per second. $H$ or $H'$ measures the amount of information generated by the source per symbol or per second. If the logarithmic base is 2, they will represent bits per symbol or per second.
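As a worked instance, the source entropy can be computed for a three-state source like that of example C. The transition and state probabilities below are assumptions (the tables are not reproduced in this text), and the rate of 10 symbols per second is purely hypothetical.

```python
from math import log2

def entropy(probs):
    """Entropy in bits of an iterable of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

# ASSUMED transition probabilities p_i(j) and state probabilities P_i.
transition = {
    "A": {"A": 0.0, "B": 4 / 5, "C": 1 / 5},
    "B": {"A": 1 / 2, "B": 1 / 2, "C": 0.0},
    "C": {"A": 1 / 2, "B": 2 / 5, "C": 1 / 10},
}
state_prob = {"A": 9 / 27, "B": 16 / 27, "C": 2 / 27}

# One entropy H_i per state, then the average weighted by P_i.
state_entropy = {s: entropy(row.values()) for s, row in transition.items()}
H_source = sum(state_prob[s] * state_entropy[s] for s in transition)

symbols_per_second = 10                  # hypothetical rate m
H_rate = symbols_per_second * H_source   # H' = m H, in bits per second

print(state_entropy)
print(H_source, H_rate)
```

Under these assumptions the source produces a little under one bit per symbol, well below the $\log_2 3$ bits an equiprobable, independent three-letter source would yield.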
If successive symbols are independent then $H$ is simply $-\sum p_i \log p_i$ where $p_i$ is the probability of symbol $i$. Suppose in this case we consider a long message of $N$ symbols. It will contain with high probability about $p_1 N$ occurrences of the first symbol, $p_2 N$ occurrences of the second, etc. Hence the probability of this particular message will be roughly
$$p = p_1^{p_1 N}\, p_2^{p_2 N} \cdots p_n^{p_n N}$$
or
$$\log p \doteq N \sum_i p_i \log p_i = -NH, \qquad H \doteq \frac{\log (1/p)}{N}.$$
H is thus approximately the logarithm of the reciprocal probability of a typical long sequence divided by the number of symbols in the sequence. The same result holds for any source. …
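This estimate can be tried on the independent source of example B. The sketch below draws one long message, computes the logarithm of its probability, and compares $\log(1/p)/N$ with $H$; the message length and seed are arbitrary assumptions.

```python
import random
from math import log2

# Letter probabilities of example (B).
probs = {"A": 0.4, "B": 0.1, "C": 0.2, "D": 0.2, "E": 0.1}
H = -sum(p * log2(p) for p in probs.values())

rng = random.Random(0)
N = 20_000
message = rng.choices(list(probs), weights=list(probs.values()), k=N)

# log p of the whole message is the sum of the logs of its symbol probabilities.
log_p = sum(log2(probs[s]) for s in message)
estimate = -log_p / N   # log(1/p) / N

print(H, estimate)
```

Working with the logarithm avoids computing $p$ itself, which for a message of this length is far too small to represent as a floating-point number.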
Reprinted from Shannon (1948), with permission from Nokia Bell Labs.
We take it for granted today that if a number is stored in computer memory and then retrieved, the number that comes out of memory is the same as the number that went in. If the value of π is stored, then when a program uses that value it will be 3.14159 every time, never 3.24159 or 3.14158. Similarly, no program meant to access memory location 2468 will ever access location 2478 instead. Computers malfunction, to be sure, because of programming errors, short circuits, and bad input data. But the bits themselves are reliably correct, even if transmitted halfway around the world or from a probe in deep space, and even though the physical stuff in which they are stored and moved is continuous, imperfect, and subject to the laws of statistical physics. When data does become garbled, our computers tend to tell us so.
It was not always so.
Because the relays and gears of early mechanical calculating equipment were so unreliable, some of those devices used parity codes to detect when a bit was in error. If there were four data bits, for example, a fifth bit would be the mod 2 sum of the other four, that is, 0 if there were an even number of 1s among the four data bits and 1 if there were an odd number of 1s among the four data bits.
Richard Hamming (1915–1998) worked with John von Neumann on the Manhattan Project before joining Bell Telephone Labs as a mathematician in 1946. There he shared an office with Claude Shannon and began to marry the computing experience he had gained at Los Alamos with the emerging science of information theory. The biography accompanying his Turing Award describes what happened in 1947 that caused him to open up an entirely new field.
One Friday, while working for Bell Laboratories, he set their pre-computer calculating machines to solving a complex problem and expected the result to be waiting for him when he began work on the following Monday. But when he arrived on Monday, he found that an error had occurred early on in the calculations and the relay-based calculators had been unable to proceed. (ACM, 1968)
A parity check had failed and the entire calculation simply stopped. Hamming, by training a mathematician who had appreciatively read Boole’s Laws of Thought, realized that if a computer could figure out that there was a mistake, a computer might also be able to figure out where the mistake was, and correct it. Thus was born the idea of an error correcting code, which in the case of a four-bit datum requires three more bits. (In fact, the “First Draft” EDVAC report, page 92 of this volume, had already observed that while hardware errors were inevitable, some might be corrected automatically.)
In this paper Hamming first improvises a specific single-error-correcting code and then establishes a general theory. He defines the distance between two bit vectors as the number of positions in which they differ, a measure now universally known as the Hamming distance. Two bit vectors that cannot be confused with one another by a one-bit error are at distance at least 2 from each other. So finding an $n$-bit code, like a simple parity code, that identifies one-bit errors is equivalent to finding a set of points in $\{0, 1\}^n$, called codewords, in which no two points have distance less than 2 between them. If the minimum distance between codewords is 3, the code can correct single-bit errors, since any bit vector can be at distance 1 from only one codeword. Designing codes thus becomes a problem of packing non-overlapping spheres (points of constant distance from a given point) in $\{0, 1\}^n$.
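The distance argument can be made concrete with two tiny codes. The sketch below is an illustration rather than anything taken from Hamming's paper: the 4-bit even-parity code shows minimum distance 2, and the 3-bit repetition code (chosen here only because it is the smallest code with minimum distance 3) shows correction by nearest codeword. All function names are assumptions.

```python
from itertools import combinations, product

def hamming_distance(u, v):
    """Number of positions in which two equal-length bit vectors differ."""
    return sum(a != b for a, b in zip(u, v))

def minimum_distance(code):
    """Smallest distance between any two distinct codewords."""
    return min(hamming_distance(u, v) for u, v in combinations(code, 2))

def nearest_codeword(word, code):
    """Decode by choosing the codeword closest to the received word."""
    return min(code, key=lambda c: hamming_distance(word, c))

# All 4-bit vectors with an even number of 1s: detects, but cannot correct, one error.
parity_code = [w for w in product((0, 1), repeat=4) if sum(w) % 2 == 0]

# Repetition code: minimum distance 3, so any single error can be corrected.
repetition_code = [(0, 0, 0), (1, 1, 1)]

print(minimum_distance(parity_code), minimum_distance(repetition_code))
print(nearest_codeword((0, 1, 0), repetition_code))
```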
The field exploded after the publication of Hamming’s paper. The physical size of bits in modern storage devices is so small, and their number so large, that errors at the bit level are inevitable. Computers achieve macroscopic perfection today by including carefully designed redundancy at the microscopic level, invisible to users, by methods built on those Hamming first developed.
Richard Hamming made many other contributions to communications theory, and in 1976 transitioned to academia. He was a proponent of improved teaching of mathematics until months before his death at the age of 82.
The author was led to the study given in this paper from a consideration of large scale computing machines in which a large number of operations must be performed without a single error in the end result. This problem of “doing things right” on a large scale is not essentially new; in a telephone central office, for example, a very large number of operations are performed while the errors leading to wrong numbers are kept well under control, though they have not been completely eliminated. This has been achieved, in part, through the use of self-checking circuits. The occasional failure that escapes routine checking is still detected by the customer and will, if it persists, result in customer complaint, while if it is transient it will produce only occasional wrong numbers. At the same time the rest of the central office functions satisfactorily. In a digital computer, on the other hand, a single failure usually means the complete failure, in the sense that if it is detected no more computing can be done until the failure is located and corrected, while if it escapes detection then it invalidates all subsequent operations of the machine. Put in other words, in a telephone central office there are a number of parallel paths which are more or less independent of each other; in a digital machine there is usually a single long path which passes through the same piece of equipment many, many times before the answer is obtained.
In transmitting information from one place to another digital machines use codes which are simply sets of symbols to which meanings or values are attached. Examples of codes which were designed to detect isolated errors are numerous; among them are the highly developed 2 out of 5 codes used extensively in common control switching systems and in the Bell Relay Computers (Alt, 1948a,b), the 3 out of 5 code used for radio telegraphy (Sparks and Kreer, 1947, especially page 417), and the word count sent at the end of telegrams.
In some situations self checking is not enough. For example, in the Model 5 Relay Computers built by Bell Telephone Laboratories for the Aberdeen Proving Grounds, observations in the early period indicated about two or three relay failures per day in the 8900 relays of the two computers, representing about one failure per two to three million relay operations. The self-checking feature meant that these failures did not introduce undetected errors. Since the machines were run on an unattended basis over nights and week-ends, however, the errors meant that frequently the computations came to a halt although often the machines took up new problems. The present trend is toward electronic speeds in digital computers where the basic elements are somewhat more reliable per operation than relays. However, the incidence of isolated failures, even when detected, may seriously interfere with the normal use of such machines. Thus it appears desirable to examine the next step beyond error detection, namely error correction.
We shall assume that the transmitting equipment handles information in the binary form of a sequence of 0s and 1s. This assumption is made both for mathematical convenience and because the binary system is the natural form for representing the open and closed relays, flip-flop circuits, dots and dashes, and perforated tapes that are used in many forms of communication. Thus each code symbol will be represented by a sequence of 0s and 1s.
The codes used in this paper are called systematic codes. Systematic codes may be defined as codes in which each code symbol has exactly $n$ binary digits, where $m$ digits are associated with the information while the other $k = n - m$ digits are used for error detection and correction. This produces a redundancy $R$ defined as the ratio of the number of binary digits used to the minimum number necessary to convey the same information, that is,
$$R = \frac{n}{m}.$$
This serves to measure the efficiency of the code as far as the transmission of information is concerned, and is the only aspect of the problem discussed in any detail here. The redundancy may be said to lower the effective channel capacity for sending information.
The need for error correction having assumed importance only recently, very little is known about the economics of the matter. It is clear that in such codes there will be extra equipment for encoding and correcting errors as well as the lowered effective channel capacity referred to above. Because of these considerations applications of these codes may be expected to occur first only under extreme conditions. Some typical situations seem to be:
a. unattended operation over long periods of time with the minimum of standby equipment.
b. extremely large and tightly interrelated systems where a single failure incapacitates the entire installation.
c. 在存在噪声的情况下发送信号,而降低噪声对信号的影响是不可能或不经济的。
c. signaling in the presence of noise where it is either impossible or uneconomical to reduce the effect of the noise on the signal.
这些情况发生得越来越频繁。前两种情况对于大型数字计算机尤其如此,而第三种情况则发生在“干扰”情况下。
These situations are occurring more and more often. The first two are particularly true of large scale digital computing machines, while the third occurs, among other places, in “jamming” situations.
本文给出了最有可能首先应用的情况下设计检错和纠错码的原则。可以通过应用众所周知的技术来设计用于实现这些原理的电路,但是这里不讨论这个问题。本文的第一部分展示了如何在以下情况下构造特殊的最小冗余码:
The principles for designing error detecting and correcting codes in the cases most likely to be applied first are given in this paper. Circuits for implementing these principles may be designed by the application of well-known techniques, but the problem is not discussed here. Part I of the paper shows how to construct special minimum redundancy codes in the following cases:
a. 单个错误检测代码
a. single error detecting codes
b. 单一纠错码
b. single error correcting codes
c. 单纠错加双检错码。
c. single error correcting plus double error detecting codes.
第二部分讨论了此类代码的一般理论,并证明在假设下第一部分的代码是“最好的”可能。
Part II discusses the general theory of such codes and proves that under the assumptions made the codes of Part I are the “best” possible.
第一部分:特殊代码
PART I: SPECIAL CODES
我们可以通过以下方式构建具有n 个二进制数字的单个错误检测代码:在前n - 1 个位置中,我们放置n - 1 位信息。在第 n个位置,我们放置 0 或 1,这样整个n个位置就有偶数个 1。这显然是单个错误检测代码,因为传输中的任何单个错误都会在代码符号中留下奇数个 1。
We may construct a single error detecting code having n binary digits in the following manner: In the first n − 1 positions we put n − 1 digits of information. In the nth position we place either 0 or 1, so that the entire n positions have an even number of 1s. This is clearly a single error detecting code since any single error in transmission would leave an odd number of 1s in a code symbol.
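The construction above can be sketched in a few lines of Python. This is only an illustration: the function names `add_even_parity` and `passes_even_parity` are hypothetical and do not appear in the paper.

```python
def add_even_parity(info_bits):
    """Append one check digit so that the whole symbol has an even number of 1s."""
    check = sum(info_bits) % 2
    return list(info_bits) + [check]


def passes_even_parity(symbol):
    """Return True when the symbol has an even number of 1s."""
    return sum(symbol) % 2 == 0


symbol = add_even_parity([1, 0, 1, 1])  # three 1s, so the check digit is 1

single_error = symbol.copy()
single_error[2] ^= 1                    # one digit changed in transmission

double_error = single_error.copy()
double_error[0] ^= 1                    # a second digit changed
```

A single changed digit leaves an odd number of 1s and is caught, while two changed digits restore even parity and pass unnoticed.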
这些代码的冗余度为 R = n/(n − 1) = 1 + 1/(n − 1),因为 m = n − 1。
The redundancy of these codes is, since m = n − 1, R = n/(n − 1) = 1 + 1/(n − 1).
看起来,为了获得低冗余,我们应该让 n 变得非常大。然而,随着 n 的增大,符号中至少出现一个错误的概率会增加;而且,出现无法检出的双重错误的风险也会增加。例如,如果 p ≪ 1 是任一位置出错的概率,那么当 n 大到 1/p 时,符号正确的概率约为 1/e = 0.3679…,而出现双重错误的概率为 1/(2e) = 0.1839…。
It might appear that to gain a low redundancy we should let n become very large. However, by increasing n, the probability of at least one error in a symbol increases; and the risk of a double error, which would pass undetected, also increases. For example, if p ≪ 1 is the probability of any error, then for n so large as 1/p, the probability of a correct symbol is approximately 1/e = 0.3679…, while a double error has probability 1/(2e) = 0.1839….
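These two figures can be checked numerically. The sketch below assumes that errors in different positions are independent, each with probability p, so that the number of errors in a symbol is binomially distributed; the function name `error_probabilities` and the choice p = 0.001 are illustrative only.

```python
import math


def error_probabilities(p):
    """Return n = 1/p, P(no error) and P(exactly two errors) for an n-digit symbol."""
    n = round(1 / p)
    p_correct = (1 - p) ** n
    p_double = math.comb(n, 2) * p**2 * (1 - p) ** (n - 2)
    return n, p_correct, p_double


n, p_correct, p_double = error_probabilities(0.001)
print(n, round(p_correct, 4), round(p_double, 4))
print(round(1 / math.e, 4), round(1 / (2 * math.e), 4))
```

For small p the two computed probabilities sit close to 1/e and 1/(2e) respectively.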
上面用于确定符号是否有任何单一错误的检查类型将在整篇论文中使用,称为奇偶校验。以上是偶校验;如果我们用奇数个 1 来确定检查位置的设置,那就是奇校验。此外,奇偶校验不必总是涉及符号的所有位置,也可以只对选定的位置进行检查。
The type of check used above to determine whether or not the symbol has any single error will be used throughout the paper and will be called a parity check. The above was an even parity check; had we used an odd number of 1s to determine the setting of the check position it would have been an odd parity check. Furthermore, a parity check need not always involve all the positions of the symbol but may be a check over selected positions only.
为了构建单个纠错码,我们首先将n个可用位置中的m 个指定为信息位置。我们认为m是固定的,但具体位置留待以后确定。接下来我们将剩余的k 个位置指定为检查位置。这k个位置中的值将在编码过程中通过对所选信息位置进行奇偶校验来确定。
To construct a single error correcting code we first assign m of the n available positions as information positions. We shall regard the m as fixed, but the specific positions are left to a later determination. We next assign the k remaining positions as check positions. The values in these k positions are to be determined in the encoding process by even parity checks over selected information positions.
让我们暂时想象一下,我们收到了一个代码符号,无论有没有错误。让我们按顺序应用 k 个奇偶校验:每当奇偶校验所分配的值与其检查位置上观察到的值一致时,我们写入 0;每当分配的值与观察到的值不一致时,我们写入 1。将这个由 k 个 0 和 1 组成的序列(以区别于奇偶校验分配的值)从右到左写成一行,可以视为一个二进制数,称为校验数。我们要求这个校验数给出任何单个错误的位置,零值表示符号中没有错误。因此,校验数必须描述 m + k + 1 种不同的情况,于是 2^k ≥ m + k + 1 是对 k 的条件。记 n = m + k,我们得到 2^m ≤ 2^n/(n + 1)。
Let us imagine for the moment that we have received a code symbol, with or without an error. Let us apply the k parity checks in order, and for each time the parity check assigns the value observed in its check position we write a 0, while for each time the assigned and observed values disagree we write a 1. When written from right to left in a line this sequence of k 0s and 1s (to be distinguished from the values assigned by the parity checks) may be regarded as a binary number and will be called the checking number. We shall require that this checking number give the position of any single error, with the zero value meaning no error in the symbol. Thus the checking number must describe m + k + 1 different things, so that 2^k ≥ m + k + 1 is a condition on k. Writing n = m + k we find 2^m ≤ 2^n/(n + 1).
使用这个不等式,我们可以计算图 13.1 ,它给出给定n的最大值m,或者,相同的是,给出给定m的最小值n。
Using this inequality we may calculate Figure 13.1, which gives the maximum m for a given n, or, what is the same thing, the minimum n for a given m.
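A table of this kind can be generated from the requirement that the k check digits distinguish n + 1 cases (no error, or an error in any one of the n positions). The following is a minimal sketch; the function name `max_information_positions` is hypothetical.

```python
def max_information_positions(n):
    """Return (m, k): the largest m, and matching k = n - m, with 2**k >= n + 1."""
    k = 0
    while 2**k < n + 1:
        k += 1
    return n - k, k


# One row per symbol length n, in the spirit of the table described above.
table = {n: max_information_positions(n) for n in range(1, 17)}
for n, (m, k) in table.items():
    print(n, m, k)
```

The same table answers the converse question: the minimum n for a given m is the first n whose row shows that m.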
我们现在确定各个奇偶校验所要作用的位置。校验数是按顺序应用奇偶校验、从右到左逐位写下相应的 0 或 1 而得到的。由于校验数要给出代码符号中任何错误的位置,因此其二进制表示最右边为 1 的任何位置都必定导致第一个校验失败。检查各个整数的二进制形式,我们发现 1 = 1、3 = 11、5 = 101、7 = 111、9 = 1001 等
We now determine the positions over which each of the various parity checks is to be applied. The checking number is obtained digit by digit, from right to left, by applying the parity checks in order and writing down the corresponding 0 or 1 as the case may be. Since the checking number is to give the position of any error in a code symbol, any position which has a 1 on the right of its binary representation must cause the first check to fail. Examining the binary form of the various integers we find 1 = 1, 3 = 11, 5 = 101, 7 = 111, 9 = 1001, and so on
最右边有一个 1。因此,第一个奇偶校验必须使用位置 1, 3, 5, 7, 9 , ...。
have a 1 on the extreme right. Thus the first parity check must use positions 1, 3, 5, 7, 9, ….
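以完全相同的方式,我们发现第二个奇偶校验必须使用那些二进制表示右起第二位为 1 的位置:2 = 10、3 = 11、6 = 110、7 = 111、10 = 1010、11 = 1011 等;
In an exactly similar fashion we find that the second parity check must use those positions which have 1s for the second digit from the right of their binary representation: 2 = 10, 3 = 11, 6 = 110, 7 = 111, 10 = 1010, 11 = 1011, and so on;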
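第三个奇偶校验使用位置 4 = 100、5 = 101、6 = 110、7 = 111、12 = 1100、13 = 1101、14 = 1110、15 = 1111、20 = 10100 等。
the third parity check, positions 4 = 100, 5 = 101, 6 = 110, 7 = 111, 12 = 1100, 13 = 1101, 14 = 1110, 15 = 1111, 20 = 10100, and so on.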
对于每个奇偶校验,仍然需要决定哪些位置包含信息以及哪些位置要进行检查。如下表所示,选择位置 1、2、4、8 ...作为检查位置的优点是使检查位置的设置彼此独立。所有其他位置都是信息位置。由此我们得到图13.2。
It remains to decide for each parity check which positions are to contain information and which the check. The choice of the positions 1, 2, 4, 8, … for check positions, as given in the following table, has the advantage of making the setting of the check positions independent of each other. All other positions are information positions. Thus we obtain Figure 13.2.
作为上述理论的说明,我们将其应用于七位代码的情况。从图 13.1 中我们发现,对于 n = 7,有 m = 4 和 k = 3。从图 13.2 中我们发现,第一个奇偶校验涉及位置 1、3、5、7,用于确定第一个位置的值;第二个奇偶校验涉及位置 2、3、6、7,并确定第二个位置的值;第三个奇偶校验涉及位置 4、5、6、7,并确定位置 4 的值。这使得位置 3、5、6、7 成为信息位置。使用位置 3、5、6、7 写下所有可能的二进制数,然后计算检查位置 1、2、4 中的值,结果如图 13.3 所示。
As an illustration of the above theory we apply it to the case of a seven-position code. From Figure 13.1 we find for n = 7, m = 4 and k = 3. From Figure 13.2 we find that the first parity check involves positions 1, 3, 5, 7 and is used to determine the value in the first position; the second parity check, positions 2, 3, 6, 7, and determines the value in the second position; and the third parity check, positions 4, 5, 6, 7, and determines the value in position four. This leaves positions 3, 5, 6, 7 as information positions. The results of writing down all possible binary numbers using positions 3, 5, 6, 7, and then calculating the values in the check positions 1, 2, 4, are shown in Figure 13.3.
因此,七位单纠错码允许 16 个代码符号。当然,还有 2^7 − 16 = 112 个无意义的符号。在某些应用中,可能需要从代码中删除第一个符号,以避免全零组合作为代码符号或作为带有单个错误的代码符号出现,因为这可能会与没有消息相混淆。这仍然会留下 15 个有用的代码符号。
Thus a seven-position single error correcting code admits of 16 code symbols. There are, of course, 2^7 − 16 = 112 meaningless symbols. In some applications it may be desirable to drop the first symbol from the code to avoid the all zero combination as either a code symbol or a code symbol plus a single error, since this might be confused with no message. This would still leave 15 useful code symbols.
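The sixteen code symbols can be generated mechanically from the three parity checks. The sketch below is one possible rendering, with hypothetical names (`CHECKS`, `INFO_POSITIONS`, `encode`, `code_table`); it keys each symbol by the decimal value of the digits in positions 3, 5, 6, 7, read in that order.

```python
from itertools import product

# Check position -> positions covered by that even parity check.
CHECKS = {1: (1, 3, 5, 7), 2: (2, 3, 6, 7), 4: (4, 5, 6, 7)}
INFO_POSITIONS = (3, 5, 6, 7)


def encode(info_bits):
    """Place four information digits and compute the three check digits."""
    symbol = [0] * 8  # index 0 is unused so indices match positions 1..7
    for position, bit in zip(INFO_POSITIONS, info_bits):
        symbol[position] = bit
    for check_position, covered in CHECKS.items():
        others = (p for p in covered if p != check_position)
        symbol[check_position] = sum(symbol[p] for p in others) % 2
    return symbol[1:]


code_table = {
    int("".join(map(str, bits)), 2): encode(bits)
    for bits in product((0, 1), repeat=4)
}
for value, symbol in code_table.items():
    print(value, symbol)
```

Because each check covers exactly one check position (its own), the three check digits can be computed in any order.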
作为此代码如何“工作”的说明,让我们采用与十进制值 12 相对应的符号 0 1 1 1 1 0 0,并将第五位的 1 更改为 0。现在我们检查新符号 0 1 1 1 0 0 0 通过本节的方法查看错误是如何定位的。从图 13.2中,第一个奇偶校验检查位置 1、3、5、7,并预测第一个位置为 1,而我们在那里找到 0;因此我们写 1。第二个奇偶校验检查位置 2、3、6、7,并正确预测第二个位置:因此我们在 1 的左边写一个 0,得到 0 1。第三个奇偶校验检查位置4、5、6、7,预测错误:所以我们在 0 1 的左边写了一个 1,得到 1 0 1。这个 0 和 1 的序列被视为二进制数,就是数字 5;因此错误位于第五位。因此,将第五位的 0 改为 1 即可获得正确的符号。
As an illustration of how this code “works” let us take the symbol 0 1 1 1 1 0 0 corresponding to the decimal value 12 and change the 1 in the fifth position to a 0. We now examine the new symbol 0 1 1 1 0 0 0 by the methods of this section to see how the error is located. From Figure 13.2 the first parity check is over positions 1, 3, 5, 7 and predicts a 1 for the first position while we find a 0 there; hence we write a 1. The second parity check is over positions 2, 3, 6, 7 and predicts the second position correctly: hence we write a 0 to the left of the 1, obtaining 0 1. The third parity check is over positions 4, 5, 6, 7 and predicts wrongly: hence we write a 1 to the left of the 0 1, obtaining 1 0 1. This sequence of 0s and 1s regarded as a binary number is the number 5; hence the error is in the fifth position. The correct symbol is therefore obtained by changing the 0 in the fifth position to a 1.
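The same walk-through can be expressed as a short program. This is a sketch under the conventions of the example above (positions numbered from 1, first check giving the rightmost binary digit of the checking number); the names `checking_number` and `correct_single_error` are hypothetical.

```python
# Positions covered by the first, second and third parity checks.
CHECKS = ((1, 3, 5, 7), (2, 3, 6, 7), (4, 5, 6, 7))


def checking_number(symbol):
    """Apply the checks in order; the i-th check supplies binary digit i from the right."""
    number = 0
    for i, covered in enumerate(CHECKS):
        failed = sum(symbol[p - 1] for p in covered) % 2
        number |= failed << i
    return number


def correct_single_error(symbol):
    """Return the corrected symbol and the error position (0 means no error)."""
    position = checking_number(symbol)
    fixed = list(symbol)
    if position:
        fixed[position - 1] ^= 1
    return fixed, position


received = [0, 1, 1, 1, 0, 0, 0]  # the symbol for 12 with its fifth digit changed
fixed, position = correct_single_error(received)
print(position, fixed)
```

The sketch assumes at most one error per symbol; with two or more errors the checking number points at a wrong position.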
为了构建单纠错加双检错码,我们从单纠错码开始。在此代码中,我们使用偶校验检查添加了一个位置来检查所有先前的位置。要了解此代码的操作,我们必须检查多种情况:
To construct a single error correcting plus double error detecting code we begin with a single error correcting code. To this code we add one more position for checking all the previous positions, using an even parity check. To see the operation of this code we have to examine a number of cases:
1. 没有错误。所有奇偶校验(包括最后的奇偶校验)均得到满足。
1. No errors. All parity checks, including the last, are satisfied.
2.单一错误。在所有这些情况下,最后的奇偶校验都会失败,无论错误是在信息、原始检查位置还是最后的检查位置。原始检查编号给出了错误的位置,现在零值表示最后的检查位置。
2. Single error. The last parity check fails in all such situations whether the error be in the information, the original check positions, or the last check position. The original checking number gives the position of the error, where now the zero value means the last check position.
3.两个错误。在所有这些情况下,最后的奇偶校验都得到满足,并且检查编号指示某种错误。
3. Two errors. In all such situations the last parity check is satisfied, and the checking number indicates some kind of error.
作为说明,让我们从之前的七位代码构造一个八位代码。为此,我们添加第八个位置,选择该位置以使这八个位置中有偶数个 1。因此,我们在图 13.3中添加第八列(参见图 13.4)。
As an illustration let us construct an eight-position code from the previous seven-position code. To do this we add an eighth position which is chosen so that there are an even number of 1s in the eight positions. Thus we add an eighth column to Figure 13.3 (see Figure 13.4).
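The three cases listed above translate directly into a decision rule for the eight-position code. The following sketch uses hypothetical names (`extend`, `classify`) and assumes that no more than two errors occur in a symbol.

```python
# Positions covered by the three original parity checks.
CHECKS = ((1, 3, 5, 7), (2, 3, 6, 7), (4, 5, 6, 7))


def extend(symbol7):
    """Add an eighth digit giving an even number of 1s over all eight positions."""
    return list(symbol7) + [sum(symbol7) % 2]


def classify(symbol8):
    """Decide between no error, a single error, and a detected double error."""
    number = 0
    for i, covered in enumerate(CHECKS):
        number |= (sum(symbol8[p - 1] for p in covered) % 2) << i
    last_check_satisfied = sum(symbol8) % 2 == 0
    if last_check_satisfied and number == 0:
        return "no error"
    if not last_check_satisfied:
        # A zero checking number now means the last check position itself.
        return f"single error at position {number or 8}"
    return "double error detected"


sent = extend([0, 1, 1, 1, 1, 0, 0])
print(classify(sent))
```

In the double error case the checking number is nonzero but no longer names a trustworthy position, so the symbol can only be flagged, not corrected.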
第二部分。一般理论
PART II. GENERAL THEORY
当研究与检错和纠错码相关的各种问题时,引入几何模型通常很方便。这里使用的模型是把作为代码符号的各种 0 和 1 序列与单位 n 维立方体的顶点对应起来。标记为 x、y、z、… 的代码点构成立方体全部顶点集合的一个子集。在这个由 2^n 个点组成的空间中,我们引入一个距离,或者通常所说的度量 D(x, y)。度量的定义基于这样的观察:代码点中的单个错误会改变一个坐标,两个错误会改变两个坐标,一般地,d 个错误会造成 d 个坐标的差异。因此,我们将两点 x 和 y 之间的距离 D(x, y) 定义为 x 和 y 不同的坐标数。这与从 x 到 y 必须遍历的最少边数相同。该距离函数满足度量通常的三个条件,即:D(x, y) = 0 当且仅当 x = y;当 x ≠ y 时 D(x, y) = D(y, x) > 0;以及 D(x, y) + D(y, z) ≥ D(x, z)(三角不等式)。
When examining various problems connected with error detecting and correcting codes it is often convenient to introduce a geometric model. The model used here consists in identifying the various sequences of 0s and 1s which are the symbols of a code with vertices of a unit n-dimensional cube. The code points, labelled x, y, z, …, form a subset of the set of all vertices of the cube. Into this space of 2^n points we introduce a distance, or, as it is usually called, a metric, D(x, y). The definition of the metric is based on the observation that a single error in a code point changes one coordinate, two errors, two coordinates, and in general d errors produce a difference in d coordinates. Thus we define the distance D(x, y) between two points x and y as the number of coordinates for which x and y are different. This is the same as the least number of edges which must be traversed in going from x to y. This distance function satisfies the usual three conditions for a metric, namely, D(x, y) = 0 if and only if x = y; D(x, y) = D(y, x) > 0 if x ≠ y; and D(x, y) + D(y, z) ≥ D(x, z) (the triangle inequality).
作为一个例子,我们注意到三维立方体中的以下每个代码点都与其他代码点相距两个单位:0 0 1;0 1 0; 1 0 0; 1 1 1. 继续使用几何语言,围绕点x半径为r的球体定义为距点x距离为r的所有点。因此,在上面的示例中,前三个代码点位于以点 (1, 1, 1) 为中心的半径为 2 的球体上。事实上,在此示例中,可以选择任何一个代码点作为中心,而其他三个代码点将位于半径为 2 的球体的表面上。
As an example we note that each of the following code points in the three-dimensional cube is two units away from the others: 0 0 1; 0 1 0; 1 0 0; 1 1 1. To continue the geometric language, a sphere of radius r about a point x is defined as all points which are at a distance r from the point x. Thus, in the above example, the first three code points are on a sphere of radius 2 about the point (1, 1, 1). In fact, in this example any one code point may be chosen as the center and the other three will lie on the surface of a sphere of radius 2.
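The distance and the sphere in this example are easy to compute directly. A minimal sketch, with the hypothetical names `distance` and `sphere`:

```python
from itertools import product


def distance(x, y):
    """Number of coordinates in which the points x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def sphere(center, radius):
    """All vertices of the cube at exactly the given distance from center."""
    n = len(center)
    return {p for p in product((0, 1), repeat=n) if distance(p, center) == radius}


code = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
pairwise = {distance(x, y) for i, x in enumerate(code) for y in code[i + 1:]}
print(pairwise)
print(sorted(sphere((1, 1, 1), 2)))
```

Note that, as defined in the text, a sphere here is the surface only: it contains the points at distance exactly r, not those closer to the center.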
如果所有代码点彼此之间的距离至少为 2,则任何单个错误都会将代码点转移到不是代码点的点,因此是一个无意义的符号。这又意味着任何单个错误都是可检测到的。如果码点之间的最小距离至少为三个单位,则任何单个错误都会使该点比任何其他码点更接近正确的码点,这意味着任何单个错误都是可纠正的。图 13.5总结了此类信息。
If all the code points are at a distance of at least 2 from each other, then it follows that any single error will carry a code point over to a point that is not a code point, and hence is a meaningless symbol. This in turn means that any single error is detectable. If the minimum distance between code points is at least three units then any single error will leave the point nearer to the correct code point than to any other code point, and this means that any single error will be correctable. This type of information is summarized in Figure 13.5.
相反,很明显,如果我们要实现列出的检测和校正,则码点之间的所有距离必须等于或超过列出的最小距离。因此,寻找合适的代码的问题与寻找空间中至少保持最小距离条件的点的子集的问题相同。§§ 13.2、13.3和13.4中的特殊代码仅描述了如何分别为最小距离2、3和4选择特定的点子集。
Conversely, it is evident that, if we are to effect the detection and correction listed, then all the distances between code points must equal or exceed the minimum distance listed. Thus the problem of finding suitable codes is the same as that of finding subsets of points in the space which maintain at least the minimum distance condition. The special codes in §§13.2, 13.3, and 13.4 were merely descriptions of how to choose a particular subset of points for minimum distances 2, 3, and 4 respectively.
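As a check on this correspondence, the minimum distance of each of the three special codes can be computed by brute force. The sketch below rebuilds small instances of the codes (a five-position parity code and the seven- and eight-position codes) under the position conventions used earlier; all function and variable names are hypothetical.

```python
from itertools import combinations, product


def distance(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def minimum_distance(code):
    """Smallest distance between any two distinct code points."""
    return min(distance(x, y) for x, y in combinations(code, 2))


def seven_position_symbol(info):
    """Information in positions 3, 5, 6, 7; even parity checks in positions 1, 2, 4."""
    d3, d5, d6, d7 = info
    c1 = (d3 + d5 + d7) % 2
    c2 = (d3 + d6 + d7) % 2
    c4 = (d5 + d6 + d7) % 2
    return (c1, c2, d3, c4, d5, d6, d7)


infos = list(product((0, 1), repeat=4))
detecting_code = [bits + (sum(bits) % 2,) for bits in infos]
correcting_code = [seven_position_symbol(bits) for bits in infos]
correcting_detecting_code = [s + (sum(s) % 2,) for s in correcting_code]

print(minimum_distance(detecting_code))
print(minimum_distance(correcting_code))
print(minimum_distance(correcting_detecting_code))
```

Brute force over all pairs is practical only for short codes such as these; it is used here purely as a check.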
或许应该指出的是,在给定的最小距离处,一些可校正性可以被交换为更多的可检测性。例如,最小距离为 5 的子集可用于:
It should perhaps be noted that, at a given minimum distance, some of the correctability may be exchanged for more detectability. For example, a subset with minimum distance 5 may be used for:
a. 双错误校正(当然,还有双错误检测)
a. double error correction (with, of course, double error detection)
b. 单错误纠正加三错误检测
b. single error correction plus triple error detection
c. 四重错误检测。
c. quadruple error detection.
现在回到第一部分中构建的特定代码,我们注意到代码中位置的任何交换都不会以任何本质的方式改变代码。交换任意位置的 0 和 1 也不行,这一过程通常称为补码。这个想法在以下定义中更加精确:
Returning for the moment to the particular codes constructed in Part I we note that any interchanges of positions in a code do not change the code in any essential way. Neither does interchanging the 0s and 1s in any position, a process usually called complementing. This idea is made more precise in the following definition:
定义。如果通过以下有限数量的操作,可以将一个代码转换为另一个代码,则称两个代码彼此等效:
Definition. Two codes are said to be equivalent to each other if, by a finite number of the following operations, one can be transformed into the other:
1、代码符号中任意两个位置的互换。
1. The interchange of any two positions in the code symbols.
2、代码符号中任意位置数值的补码。
2. The complementing of the values in any position in the code symbols.
这是一个正式的等价关系 ( ∼ ),因为A ∼ A;A ∼ B意味着B ∼ A;并且A ∼ B , B ∼ C意味着A ∼ C。因此,我们可以将对一类代码的研究简化为对每个等价类的典型成员的研究。就几何模型而言,等价变换相当于单位立方体的旋转和反射。
This is a formal equivalence relation (∼) since A ∼ A; A ∼ B implies B ∼ A; and A ∼ B, B ∼ C implies A ∼ C. Thus we can reduce the study of a class of codes to the study of typical members of each equivalence class. In terms of the geometric model, equivalence transformations amount to rotations and reflections of the unit cube.
本节研究的问题是在单位 n 维立方体中放置最多数量的点,使得任意两点之间的距离都不小于 2 个单位。我们将证明,如第 13.2 节那样,可以这样放置 2^(n−1) 个点,并且任何这样的最优放置都等价于第 13.2 节中使用的放置方式。
The problem studied in this section is that of packing the maximum number of points in a unit n-dimensional cube such that no two points are closer than 2 units from each other. We shall show that, as in §13.2, 2^(n−1) points can be so packed, and, further, that any such optimal packing is equivalent to that used in §13.2.
为了证明这些陈述,我们首先观察到 n 维立方体的顶点由两个 (n − 1) 维立方体的顶点组成。令 A 为原始立方体中可容纳的最大点数。那么两个 (n − 1) 维立方体之一至少有 A/2 个点。这个立方体再次分解为两个低一维的立方体,其中一个至少有 A/2^2 个点。如此继续,我们得到一个具有 A/2^(n−2) 个点的二维立方体。我们现在观察到,一个正方形中最多只能有两个点彼此相距至少两个单位;因此,原始 n 维立方体中彼此间隔不少于两个单位的点最多有 2^(n−1) 个。
To prove these statements we first observe that the vertices of the n-dimensional cube are composed of those of two (n − 1)-dimensional cubes. Let A be the maximum number of points packed in the original cube. Then one of the two (n − 1)-dimensional cubes has at least A/2 points. This cube being again decomposed into two lower dimensional cubes, we find that one of them has at least A/2^2 points. Continuing in this way we come to a two-dimensional cube having A/2^(n−2) points. We now observe that a square can have at most two points separated by at least two units; hence the original n-dimensional cube had at most 2^(n−1) points not less than two units apart.
为了证明任何两个最佳包装的等价性,我们注意到,如果包装是最佳的,那么两个子立方体中的每一个都有一半的点。称其为第一个坐标,我们看到一半的点有 0,一半的点有 1。下一个细分将再次将它们分为两个相等的组,分别有 0 和 1。在 ( n − 1) 个这样的阶段之后,在重新排序分配的值(如果有)时,我们得到了设计的代码的前n − 1 个位置在第13.2节中。对于前n - 1 个坐标的每个序列,存在n - 1 个与其相差一个坐标的其他序列。一旦我们固定了某个点的第 n个坐标,比如全为 0 的原点,那么为了保持代码点之间两个单位的已知最小距离,所有其他代码点的第n个坐标都是唯一确定的。因此,最后一个坐标是在补码内确定的,以便任何最佳代码都等效于第13.2节中给出的代码。
To prove the equivalence of any two optimal packings we note that, if the packing is optimal, then each of the two sub-cubes has half the points. Calling this the first coordinate we see that half the points have a 0 and half have a 1. The next subdivision will again divide these into two equal groups having 0s and 1s respectively. After (n − 1) such stages we have, upon re-ordering the assigned values if there be any, exactly the first n − 1 positions of the code devised in §13.2. To each sequence of the first n − 1 coordinates there exist n − 1 other sequences which differ from it by one coordinate. Once we fix the nth coordinate of some one point, say the origin which has all 0s, then to maintain the known minimum distance of two units between code points the nth coordinate is uniquely determined for all other code points. Thus the last coordinate is determined within a complementation so that any optimal code is equivalent to that given in §13.2.
有趣的是,在这两个证明中,我们仅使用了代码符号的长度均为n的假设。
It is interesting to note that in these two proofs we have used only the assumption that the code symbols are all of length n.
读者可能已经注意到,在第一部分的特定代码中,对信息位置和检查位置进行了区分,而在几何模型中,各个坐标之间没有真正的区别。为了使两种处理方式更加一致,我们重新定义了系统代码,即符号长度全部相等且
It has probably been noted by the reader that, in the particular codes of Part I, a distinction was made between information and check positions, while, in the geometric model, there is no real distinction between the various coordinates. To bring the two treatments more in line with each other we re-define a systematic code as a code whose symbol lengths are all equal and
1. 检查的位置与符号中包含的信息无关。
1. The positions checked are independent of the information contained in the symbol.
2. 检查相互独立。
2. The checks are independent of each other.
3. 我们使用奇偶校验。
3. We use parity checks.
这相当于前面的定义。为了证明这一点,我们形成一个矩阵,其第 i行在第 i个奇偶校验位置处有 1 ,在其他位置有 0。根据假设 1,矩阵是固定的并且不会因代码符号的不同而改变。从 2 开始,矩阵的秩为k。这反过来意味着系统可以求解以其他n − k 个位置表示的k个位置。假设 3 表明在该求解中我们使用 1 + 1 = 0 的算术。
This is equivalent to the earlier definition. To show this we form a matrix whose ith row has 1s in the positions of the ith parity check and 0s elsewhere. By assumption 1 the matrix is fixed and does not change from code symbol to code symbol. From 2 the rank of the matrix is k. This in turn means that the system can be solved for k of the positions expressed in terms of the other n − k positions. Assumption 3 indicates that in this solving we use the arithmetic in which 1 + 1 = 0.
存在非系统码,但迄今为止还没有发现对于给定的n和最小距离d具有比系统码更多的码符号。……
There exist non-systematic codes, but so far none have been found which for a given n and minimum distance d have more code symbols than a systematic code. …
转向本节的主要问题,我们从图 13.5 中发现,单纠错码的码点彼此至少相距三个单位。因此,每个点都可以被一个半径为 1 的球体包围,且任意两个球体没有公共点。每个球体有一个中心点和其表面上的 n 个点,总共 n + 1 个点。因此,由 2^n 个点组成的空间最多可以有 2^n/(n + 1) 个
Turning to the main problem of this section we find from Figure 13.5 that a single error correcting code has code points at least three units from each other. Thus each point may be surrounded by a sphere of radius 1 with no two spheres having a point in common. Each sphere has a center point and n points on its surface, a total of n + 1 points. Thus the space of 2^n points can have at most 2^n/(n + 1)
球体。这正是我们之前在第13.3节中找到的界限。
spheres. This is exactly the bound we found before in §13.3.
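For the seven-position code the counting argument can be verified exhaustively. The sketch below assumes the position conventions used earlier (information in positions 3, 5, 6, 7) and uses hypothetical names; it checks that the sixteen unit spheres are disjoint and together account for every one of the 128 points.

```python
from itertools import product


def distance(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def seven_position_symbol(info):
    """Information in positions 3, 5, 6, 7; even parity checks in positions 1, 2, 4."""
    d3, d5, d6, d7 = info
    return ((d3 + d5 + d7) % 2, (d3 + d6 + d7) % 2, d3,
            (d5 + d6 + d7) % 2, d5, d6, d7)


def sphere_bound(n):
    """Largest whole number of disjoint spheres of n + 1 points among 2**n points."""
    return 2**n // (n + 1)


code = [seven_position_symbol(bits) for bits in product((0, 1), repeat=4)]

# For every vertex of the cube, count the code points within distance 1 of it.
owners = [
    sum(distance(point, c) <= 1 for c in code)
    for point in product((0, 1), repeat=7)
]
print(sphere_bound(7), len(code), set(owners))
```

Every vertex belongs to exactly one sphere, so for n = 7 the bound is met with no points left over.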
虽然我们已经证明第 13.3 节中构造的特定单纠错码具有最小冗余,但我们不能证明所有最优码都是等效的,因为下面的简单示例表明情况并非如此。对于 n = 4,从图 13.1 可以看出,m = 1,k = 3。因此,四位代码中最多有两个代码符号。下面两个最优代码显然不等价:由 0 0 0 0 和 1 1 1 1 组成的代码,以及由 0 0 0 0 和 0 1 1 1 组成的代码。
While we have shown that the specific single error correcting code constructed in §13.3 is of minimum redundancy, we cannot show that all optimal codes are equivalent, since the following trivial example shows that this is not so. For n = 4 we find from Figure 13.1 that m = 1 and k = 3. Thus there are at most two code symbols in a four-position code. The following two optimal codes are clearly not equivalent: the code consisting of 0 0 0 0 and 1 1 1 1, and the code consisting of 0 0 0 0 and 0 1 1 1.
在本节中,我们将证明第13.4节中构造的代码具有最小冗余度。我们已经在第13.4节中展示了如何对于最小距离为 3 的n − 1 维最小冗余码,我们可以构造一个具有相同数量的码符号但最小距离为 4 的n维码。如果这是如果不是最小冗余,则可能存在具有更多代码符号但具有相同的n和它们之间的相同的最小距离 4 的代码。使用此代码,我们删除最后一个坐标。这将维度从n减少到n − 1,并且代码符号之间的最小距离最多减少一个单位,同时代码符号的数量保持不变。这与我们开始构建的代码具有最小冗余的假设相矛盾。因此,第13.4节的代码具有最小冗余度。
In this section we shall prove that the codes constructed in §13.4 are of minimum redundancy. We have already shown in §13.4 how for a minimum redundancy code of n − 1 dimensions with a minimum distance of 3, we can construct an n dimensional code having the same number of code symbols but with a minimum distance of 4. If this were not of minimum redundancy there would exist a code having more code symbols but with the same n and the same minimum distance 4 between them. Taking this code we remove the last coordinate. This reduces the dimension from n to n − 1 and the minimum distance between code symbols by, at most, one unit, while leaving the number of code symbols the same. This contradicts the assumption that the code we began our construction with was of minimum redundancy. Thus the codes of §13.4 are of minimum redundancy.
这是以下一般定理的特例: 对于任何n − 1 维N点且最小距离为 2 k − 1 的最小冗余码,都对应有一个具有最小距离的 n 维N点最小冗余码2 k,反之亦然。为了从n − 1 维代码构建n维代码,我们只需添加单个第n坐标,该坐标通过对n个位置进行偶校验检查来固定。这也会使最小距离增加 1,原因如下:在n − 1 维代码中,彼此距离 2 k − 1 的任何两个点在其坐标之间具有奇数个差异。因此,对两个点进行相反的奇偶校验设置,将它们之间的距离增加到 2 k。附加坐标无法减少任何距离,因此代码中的所有点现在的最小距离为 2 k。要反向操作,我们只需从n维代码中删除一个坐标即可。这将最小距离 2 k减少到 2 k − 1,同时保持N不变。显然,如果一个代码具有最小冗余度,那么另一个代码也具有最小冗余度。……
This is a special case of the following general theorem: To any minimum redundancy code of N points in n− 1 dimensions and having a minimum distance of 2k − 1 there corresponds a minimum redundancy code of N points in n dimensions having a minimum distance of 2k, and conversely. To construct the n dimensional code from the n− 1 dimensional code we simply add a single nth coordinate which is fixed by an even parity check over the n positions. This also increases the minimum distance by 1 for the following reason: Any two points which, in the n − 1 dimensional code, were at a distance 2k − 1 from each other had an odd number of differences between their coordinates. Thus the parity check was set oppositely for the two points, increasing the distance between them to 2k. The additional co-ordinate could not decrease any distances, so that all points in the code are now at a minimum distance of 2k. To go in the reverse direction we simply drop one coordinate from the n dimensional code. This reduces the minimum distance of 2k to 2k − 1 while leaving N the same. It is clear that if one code is of minimum redundancy then the other is too. …
经诺基亚贝尔实验室许可,转载自 Hamming (1950)。
Reprinted from Hamming (1950), with permission from Nokia Bell Labs.
《论可计算数》出版后不久,艾伦·图灵就开始间歇性地与英国情报部门合作破译密码(有关图灵生平的更多信息,请参阅第 6 章)。英国对德宣战后,他开始在英国战时情报中心布莱切利公园全职工作。在那里,他最终成功破解了用于德国指挥部与船只和部队之间通信的恩尼格玛密码。与布莱切利公园的所有工作一样,它几十年来一直处于机密状态且未出版,但现在存在一个共识,即图灵的工作对于战争努力至关重要,并且可能大大缩短了战争时间。事实上,英国在 1946 年承认了他的贡献。
Alan Turing started intermittent work with British intelligence on codebreaking not long after publication of “On Computable Numbers” (see chapter 6 for more on Turing’s life). After the UK declared war on Germany, he began full-time work at Bletchley Park, the wartime center of British intelligence. There he led an ultimately successful effort to crack the Enigma code that was being used for communications between the German command and ships and troops. Like all Bletchley Park work, it remained classified and unpublished for decades, but a consensus now exists that Turing’s work was essential to the war effort and may have shortened the war significantly. Indeed, Britain recognized him for his service in 1946.
借鉴战时经验,大约在摩尔学院小组设计 EDVAC 的同时,图灵开始在国家物理实验室设计“自动计算引擎”,称为 ACE。尽管它最终会成为一台合适的存储程序计算机,但由于与秘密研究重叠,其构造受到繁文缛节的束缚,图灵也离开了该项目。他转而转到曼彻斯特大学,并积极参与 Mark 1 的设计(第 15 章进一步讨论;这是一台与哈佛大学艾肯的 Mark I 不同的机器)。在曼彻斯特期间,他开始想象计算机有一天能够做什么。这篇出色的论文使“图灵测试”进入了流行的说法,尽管正如图灵在引言中明确指出的那样,他的“模仿游戏”并不是测试机器是否能够思考(图灵将这个问题描述为“毫无意义”),而是一个科学上易于处理的替代问题。魏森鲍姆的 ELIZA 程序(第 27 章)很容易引诱人们质疑图灵测试的适当性;哲学家约翰·塞尔(John Searle,1980)提出了更细致的反对它的论点。这篇论文的论点至今仍在争论中(参见 Shieber [2004] 的详细说明)。我们省略了图灵对数字计算机及其模型的普遍性的解释。
Drawing on his wartime experience and at about the same time as the Moore School group was at work designing the EDVAC, Turing began the design at the National Physical Laboratory of an “Automatic Computing Engine,” dubbed the ACE. Though it would eventually become a proper stored-program computer, its construction was tied up in red tape because of the overlap with secret research, and Turing left the project. He moved instead to the University of Manchester and became active in the design of the Mark 1 (discussed further in chapter 15; a different machine from Aiken’s Mark I at Harvard). While at Manchester he began to imagine what computers might someday be able to do. This remarkable paper caused the “Turing test” to enter popular parlance, even though, as Turing makes clear in the introduction, his “imitation game” is not a test of whether machines can think (a question Turing describes as “meaningless”) but instead a scientifically tractable substitute question. Weizenbaum’s ELIZA program (chapter 27) seduced people so readily as to challenge the appropriateness of the Turing test; the philosopher John Searle (1980) advanced a more nuanced argument against it. The paper’s arguments are still debated today (see Shieber [2004] for a thorough account). We omit Turing’s explanations of digital computers and of the universality of his model.
图灵被人们铭记不仅是因为他的数学工作,还因为他广泛的好奇心。这篇论文最初发表的《Mind》是一份重要的哲学期刊;除了描述数学自动机的普遍性的部分外,本文全文转载于此。在图灵生命的最后几年,他转向了长期以来令他着迷的数学生物学。
Turing is remembered not only for his mathematical work but for the breadth of his curiosity. Mind, where this paper was originally published, is an important philosophy journal; the paper is here reprinted in full, except for a section describing the universality of mathematical automata. In Turing’s last years he turned to mathematical biology, which had long fascinated him.
我建议考虑这个问题:“机器能思考吗?” 这应该从术语“机器”和“思考”的含义定义开始。定义的框架可以尽可能地反映词语的正常使用,但这种态度是危险的。如果要通过检查“机器”和“思考”这两个词的常用方式来找到它们的含义,就很难逃脱这样的结论:“机器能思考吗?”这个问题的含义和答案。可以通过盖洛普民意调查等统计调查来寻找。但这是荒谬的。我不会尝试这样的定义,而是用另一个与该问题密切相关并用相对明确的词语表达的问题来代替该问题。
I propose to consider the question, “Can machines think?” This should begin with definitions of the meaning of the terms “machine” and “think.” The definitions might be framed so as to reflect so far as possible the normal use of the words, but this attitude is dangerous. If the meaning of the words “machine” and “think” are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, “Can machines think?” is to be sought in a statistical survey such as a Gallup poll. But this is absurd. Instead of attempting such a definition I shall replace the question by another, which is closely related to it and is expressed in relatively unambiguous words.
这个问题的新形式可以用我们称之为“模仿游戏”的游戏来描述。它由三个人玩,一个男人(A),一个女人(B)和一个审讯者(C),可以是任何性别。审讯者待在一个与其他两人分开的房间里。审讯者游戏的目的是确定另外两个人中哪一个是男人,哪一个是女人。他通过标签 X 和 Y 认识它们,在游戏结束时,他说“X 是 A,Y 是 B”或“X 是 B,Y 是 A”。询问者可以向 A 和 B 提出问题,如下:
The new form of the problem can be described in terms of a game which we call the “imitation game.” It is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either “X is A and Y is B” or “X is B and Y is A.” The interrogator is allowed to put questions to A and B thus:
C:X 请告诉我他或她的头发长度吗?
C: Will X please tell me the length of his or her hair?
现在假设X实际上是A,那么A必须回答。A在游戏中的目的是试图导致C做出错误的识别。因此他的答案可能是:
Now suppose X is actually A, then A must answer. It is A’s object in the game to try and cause C to make the wrong identification. His answer might therefore be:
“我的头发是木瓦状的,最长的头发约有九英寸长。”
“My hair is shingled, and the longest strands are about nine inches long.”
为了避免语气对询问者有帮助,答案应该写下来,或者更好的是打字。理想的安排是有一台电传打印机在两个房间之间进行通信。或者,中间人可以重复问题和答案。第三名玩家(B)的游戏目标是帮助审讯者。对她来说最好的策略可能是给出真实的答案。她可以加上诸如“我是女人,别听他的!”之类的话。对于她的回答,但这毫无用处,因为这个男人可以发表类似的言论。
In order that tones of voice may not help the interrogator the answers should be written, or better still, typewritten. The ideal arrangement is to have a teleprinter communicating between the two rooms. Alternatively the question and answers can be repeated by an intermediary. The object of the game for the third player (B) is to help the interrogator. The best strategy for her is probably to give truthful answers. She can add such things as “I am the woman, don’t listen to him!” to her answers, but it will avail nothing as the man can make similar remarks.
我们现在问一个问题:“当机器在这个游戏中扮演 A 的角色时,会发生什么?” 当这样的游戏进行时,审问者会像在男人和女人之间进行游戏时一样经常做出错误的决定吗?这些问题取代了我们原来的“机器能思考吗?”
We now ask the question, “What will happen when a machine takes the part of A in this game?” Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? These questions replace our original, “Can machines think?”
除了问“这个新形式的问题的答案是什么”之外,人们还可能会问“这个新问题值得研究吗?”对于后一个问题,我们不再多言,直接着手研究,从而避免陷入无限回归。
As well as asking, “What is the answer to this new form of the question,” one may ask, “Is this new question a worthy one to investigate?” This latter question we investigate without further ado, thereby cutting short an infinite regress.
这个新问题的优点是在人的体力和智力之间划出了相当清晰的界线。没有工程师或化学家声称能够生产出与人类皮肤没有区别的材料。也许有一天这会实现,但即使假设这项发明可用,我们也应该觉得试图通过用这样的人造肉来装扮“思考机器”来使其更加人性化是没有意义的。表格在我们所设置的问题反映了这一事实,即阻止询问者看到或接触其他参赛者,或听到他们的声音。所提出的标准的一些其他优点可以通过样本问题和答案来显示。因此:
The new problem has the advantage of drawing a fairly sharp line between the physical and the intellectual capacities of a man. No engineer or chemist claims to be able to produce a material which is indistinguishable from the human skin. It is possible that at some time this might be done, but even supposing this invention available we should feel there was little point in trying to make a “thinking machine” more human by dressing it up in such artificial flesh. The form in which we have set the problem reflects this fact in the condition which prevents the interrogator from seeing or touching the other competitors, or hearing their voices. Some other advantages of the proposed criterion may be shown up by specimen questions and answers. Thus:
问:请给我写一首以福斯桥为主题的十四行诗。
Q: Please write me a sonnet on the subject of the Forth Bridge.
答:这件事就别指望我了。我从来不会写诗。
A: Count me out on this one. I never could write poetry.
问:将 34957 添加到 70764。
Q: Add 34957 to 70764.
答:(停顿约30秒后给出答案)105621。
A: (Pause about 30 seconds and then give as answer) 105621.
问:你会下棋吗?
Q: Do you play chess?
答:是的。
A: Yes.
问:我的 K1 上有 K,没有其他棋子。K6 处只有 K,R1 处只有 R。这是你的举动。你玩什么?
Q: I have K at my K1, and no other pieces. You have only K at K6 and R at R1. It is your move. What do you play?
A:(停顿15秒后)R-R8伙伴。
A: (After a pause of 15 seconds) R-R8 mate.
问答法似乎适合介绍我们希望涵盖的人类努力的几乎任何一个领域。我们不想因为机器在选美比赛中表现不佳而惩罚它,也不想因为在与飞机的比赛中失败而惩罚一个人。我们的游戏条件使这些残疾变得无关紧要。如果“证人”认为合适,他们可以随意吹嘘自己的魅力、力量或英雄主义,但审讯者不能要求实际的示范。
The question and answer method seems to be suitable for introducing almost any one of the fields of human endeavour that we wish to include. We do not wish to penalise the machine for its inability to shine in beauty competitions, nor to penalise a man for losing in a race against an aeroplane. The conditions of our game make these disabilities irrelevant. The “witnesses” can brag, if they consider it advisable, as much as they please about their charms, strength or heroism, but the interrogator cannot demand practical demonstrations.
该游戏可能会受到批评,因为赔率对机器的影响太大。如果这个人试图假装成机器,他的表现显然会很糟糕。他很快就会因为算术的缓慢和不准确而出卖自己。机器难道不能执行一些应该被描述为思考但与人所做的事情截然不同的事情吗?这个反对意见是非常强烈的,但至少我们可以说,如果可以构造一台机器来令人满意地玩模仿游戏,我们就不必为这个反对意见所困扰。
The game may perhaps be criticised on the ground that the odds are weighted too heavily against the machine. If the man were to try and pretend to be the machine he would clearly make a very poor showing. He would be given away at once by slowness and inaccuracy in arithmetic. May not machines carry out something which ought to be described as thinking but which is very different from what a man does? This objection is a very strong one, but at least we can say that if, nevertheless, a machine can be constructed to play the imitation game satisfactorily, we need not be troubled by this objection.
有人可能会提出,在玩“模仿游戏”时,机器的最佳策略可能不是模仿人类的行为。可能是这样,但我认为不太可能有这样大的效果。无论如何,这里无意研究游戏理论,并且会假设最好的策略是尝试提供男人自然给出的答案。
It might be urged that when playing the “imitation game” the best strategy for the machine may possibly be something other than imitation of the behaviour of a man. This may be, but I think it is unlikely that there is any great effect of this kind. In any case there is no intention to investigate here the theory of the game, and it will be assumed that the best strategy is to try to provide answers that would naturally be given by a man.
在我们明确“机器”一词的含义之前,我们在第14.1节中提出的问题不会很明确。很自然,我们希望允许各种工程技术在我们的机器中使用。我们还希望允许工程师或工程师团队建造一台可以工作的机器,但其构造者无法令人满意地描述其操作方式,因为他们应用了一种很大程度上是实验性的方法。最后,我们希望将正常出生的人排除在机器之外。很难制定定义来满足这三个条件。例如,人们可能会坚持认为工程师团队应该全部是同一种性别,但这并不能真正令人满意,因为很可能从一个男人的皮肤细胞(比如说)中培育出一个完整的个体。做到这一点将是生物技术的一项壮举,值得最高的赞誉,但我们不会倾向于将其视为“建造一台思考机器”的案例。这促使我们放弃应该允许每种技术的要求。鉴于当前对“思考机器”的兴趣是由一种通常称为“电子计算机”或“数字计算机”的特殊机器引起的,我们更愿意这样做。根据此建议,我们只允许数字计算机参与我们的游戏。
The question which we put in §14.1 will not be quite definite until we have specified what we mean by the word “machine.” It is natural that we should wish to permit every kind of engineering technique to be used in our machines. We also wish to allow the possibility that an engineer or team of engineers may construct a machine which works, but whose manner of operation cannot be satisfactorily described by its constructors because they have applied a method which is largely experimental. Finally, we wish to exclude from the machines men born in the usual manner. It is difficult to frame the definitions so as to satisfy these three conditions. One might for instance insist that the team of engineers should be all of one sex, but this would not really be satisfactory, for it is probably possible to rear a complete individual from a single cell of the skin (say) of a man. To do so would be a feat of biological technique deserving of the very highest praise, but we would not be inclined to regard it as a case of “constructing a thinking machine.” This prompts us to abandon the requirement that every kind of technique should be permitted. We are the more ready to do so in view of the fact that the present interest in “thinking machines” has been aroused by a particular kind of machine, usually called an “electronic computer” or “digital computer.” Following this suggestion we only permit digital computers to take part in our game.
乍一看,这一限制似乎非常严厉。我将试图证明现实并非如此。为此,需要简要介绍这些计算机的性质和特性。也可以说,这种将机器等同于数字计算机的做法,就像我们对“思考”的标准一样,只有在数字计算机无法在游戏中表现出色的情况下(与我的信念相反),才不会令人满意。
This restriction appears at first sight to be a very drastic one. I shall attempt to show that it is not so in reality. To do this necessitates a short account of the nature and properties of these computers. It may also be said that this identification of machines with digital computers, like our criterion for “thinking,” will only be unsatisfactory if (contrary to my belief), it turns out that digital computers are unable to give a good showing in the game.
已经有许多数字计算机处于正常工作状态,人们可能会问:“为什么不立即尝试实验呢?满足游戏条件很容易。可以使用多个询问器,并编制统计数据以显示正确识别的频率。” 简而言之,我们不是在问是否所有数字计算机都会在游戏中表现出色,也不是在问现有的计算机是否会表现出色,而是在问是否有可以想象的计算机会表现出色。但这只是简短的答案。稍后我们将从不同的角度看待这个问题。
There are already a number of digital computers in working order, and it may be asked, “Why not try the experiment straight away? It would be easy to satisfy the conditions of the game. A number of interrogators could be used, and statistics compiled to show how often the right identification was given.” The short answer is that we are not asking whether all digital computers would do well in the game nor whether the computers at present available would do well, but whether there are imaginable computers which would do well. But this is only the short answer. We shall see this question in a different light later.
数字计算机背后的想法可以这样解释:这些机器旨在执行人类计算机可以完成的任何操作。人类计算机应该遵循固定的规则;他无权在任何细节上偏离这些规定。我们可以假设这些规则在一本书中提供,每当他从事新工作时,这本书就会改变。他还有无限量的纸张供他进行计算。他也可以在“台式机器”上进行乘法和加法,但这并不重要。……
The idea behind digital computers may be explained by saying that these machines are intended to carry out any operations which could be done by a human computer. The human computer is supposed to be following fixed rules; he has no authority to deviate from them in any detail. We may suppose that these rules are supplied in a book, which is altered whenever he is put on to a new job. He has also an unlimited supply of paper on which he does his calculations. He may also do his multiplications and additions on a “desk machine,” but this is not important. …
数字计算机的想法由来已久。1828 年至 1839 年间担任剑桥卢卡斯数学教授的查尔斯·巴贝奇 (Charles Babbage) 计划设计这样一种机器,称为分析机,但它从未完成。尽管巴贝奇拥有所有基本的想法,但他的机器在当时并不是一个非常有吸引力的前景。可用的速度肯定比人类计算机快,但比曼彻斯特机器慢 100 倍,曼彻斯特机器本身就是最慢的现代机器之一。存储是纯机械的,使用轮子和卡片。
The idea of a digital computer is an old one. Charles Babbage, Lucasian Professor of Mathematics at Cambridge from 1828 to 1839, planned such a machine, called the Analytical Engine, but it was never completed. Although Babbage had all the essential ideas, his machine was not at that time such a very attractive prospect. The speed which would have been available would be definitely faster than a human computer but something like 100 times slower than the Manchester machine, itself one of the slower of the modern machines. The storage was to be purely mechanical, using wheels and cards.
巴贝奇的分析机是完全机械的这一事实将帮助我们摆脱迷信。人们常常重视这样一个事实:现代数字计算机是电气的,神经系统也是电气的。由于巴贝奇的机器不是电动的,并且所有数字计算机在某种意义上都是等效的,因此我们看到这种使用电不可能具有理论上的重要性。当然,电力通常出现在涉及快速信号传输的地方,因此我们在这两种连接中发现电力也就不足为奇了。在神经系统中,化学现象至少与电现象一样重要。在某些计算机中,存储系统主要是声学的。因此,使用电的特征被认为只是非常表面的相似之处。如果我们希望找到这种相似之处,我们应该寻找函数的数学类比。
The fact that Babbage’s Analytical Engine was to be entirely mechanical will help us to rid ourselves of a superstition. Importance is often attached to the fact that modern digital computers are electrical, and that the nervous system also is electrical. Since Babbage’s machine was not electrical, and since all digital computers are in a sense equivalent, we see that this use of electricity cannot be of theoretical importance. Of course electricity usually comes in where fast signalling is concerned, so that it is not surprising that we find it in both these connections. In the nervous system chemical phenomena are at least as important as electrical. In certain computers the storage system is mainly acoustic. The feature of using electricity is thus seen to be only a very superficial similarity. If we wish to find such similarities we should look rather for mathematical analogies of function.
上一节中考虑的数字计算机可以归类为“离散状态机”。这些机器通过突然跳跃或点击从一种相当确定的状态移动到另一种状态。这些状态差异很大,因此可以忽略它们之间混淆的可能性。严格来说,不存在这样的机器。一切都在不断地运动。但是有很多种机器可以被认为是离散状态机,这是有利的。例如,在考虑照明系统的开关时,一个方便的假设是每个开关必须绝对打开或绝对关闭。必须有中间位置,但对于大多数目的,我们可以忘记它们。… [编辑:省略了普遍性的论点]
The digital computers considered in the last section may be classified amongst the “discrete-state machines.” These are the machines which move by sudden jumps or clicks from one quite definite state to another. These states are sufficiently different for the possibility of confusion between them to be ignored. Strictly speaking there are no such machines. Everything really moves continuously. But there are many kinds of machine which can profitably be thought of as being discrete-state machines. For instance in considering the switches for a lighting system it is a convenient fiction that each switch must be definitely on or definitely off. There must be intermediate positions, but for most purposes we can forget about them. … [EDITOR: Argument for universality omitted]
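The lighting-switch example can be written down as a transition table. This is a minimal sketch for illustration only; the names `TRANSITIONS`, `step`, and `run`, and the two input signals "flip" and "hold", are assumptions introduced here and do not come from the paper.

```python
# A light switch treated as a discrete-state machine: exactly two states,
# and each input causes a sudden jump from one definite state to another.
# The intermediate positions of a physical switch are deliberately ignored.
TRANSITIONS = {
    ("off", "flip"): "on",
    ("on", "flip"): "off",
    ("off", "hold"): "off",
    ("on", "hold"): "on",
}

def step(state, signal):
    """Return the next state for the given current state and input signal."""
    return TRANSITIONS[(state, signal)]

def run(state, signals):
    """Apply a sequence of input signals, starting from `state`."""
    for signal in signals:
        state = step(state, signal)
    return state
```

Starting from "off", the sequence flip, hold, flip, flip leaves the switch "on". The whole behaviour of the machine is fixed by the table, which is what makes the "convenient fiction" useful.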
数字计算机的这种特殊属性,即它们可以模仿任何离散状态机,可以通过说它们是通用机器来描述。具有这种特性的机器的存在具有重要的后果,即除了考虑速度之外,没有必要设计各种新机器来执行各种计算过程。它们都可以通过一台数字计算机来完成,并针对每种情况进行适当的编程。由此可见,所有数字计算机在某种意义上都是等效的。
This special property of digital computers, that they can mimic any discrete-state machine, is described by saying that they are universal machines. The existence of machines with this property has the important consequence that, considerations of speed apart, it is unnecessary to design various new machines to do various computing processes. They can all be done with one digital computer, suitably programmed for each case. It will be seen that as a consequence of this all digital computers are in a sense equivalent.
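The idea that one machine, "suitably programmed for each case," can stand in for many can be sketched with a single routine that runs any finite transition table. This illustrates the idea only and is not the universality argument omitted above. The function `simulate` and the two example tables are hypothetical.

```python
def simulate(table, start, inputs):
    """Run any discrete-state machine described by a transition table.

    `table` maps (state, input) to (next_state, output).
    Returns the final state and the list of outputs produced.
    """
    state, outputs = start, []
    for symbol in inputs:
        state, out = table[(state, symbol)]
        outputs.append(out)
    return state, outputs

# Machine 1: tracks whether an even or odd number of 1s has been seen.
parity = {
    ("even", 0): ("even", "even"),
    ("even", 1): ("odd", "odd"),
    ("odd", 0): ("odd", "odd"),
    ("odd", 1): ("even", "even"),
}

# Machine 2: a counter modulo 3 that outputs its new count on each tick.
mod3 = {(s, "tick"): ((s + 1) % 3, (s + 1) % 3) for s in range(3)}
```

The same `simulate` routine runs both `parity` and `mod3`. Only the table, playing the role of the programme, changes. Speed and storage limits are ignored here, as they are in the passage above.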
我们现在可以再次考虑第14.3节末尾提出的观点。有人试探性地提出,“机器能思考吗?”这个问题应该替换为“是否有可以想象的数字计算机在模仿游戏中表现出色?” 如果我们希望的话,我们可以让这个问题表面上更加普遍,并问“是否有可以做得很好的离散状态机?” 但鉴于普遍性,我们看到这两个问题都等价于:“让我们把注意力集中在一台特定的数字计算机 C 上。通过修改这台计算机使其拥有足够的存储空间、适当提高其运行速度并为其提供适当的程序,是否真的可以使 C 在模仿游戏中令人满意地扮演 A 的角色,而 B 的角色则由人扮演?”
We may now consider again the point raised at the end of §14.3. It was suggested tentatively that the question, “Can machines think?” should be replaced by “Are there imaginable digital computers which would do well in the imitation game?” If we wish we can make this superficially more general and ask “Are there discrete-state machines which would do well?” But in view of the universality property we see that either of these questions is equivalent to this, “Let us fix our attention on one particular digital computer C. Is it true that by modifying this computer to have an adequate storage, suitably increasing its speed of action, and providing it with an appropriate programme, C can be made to play satisfactorily the part of A in the imitation game, the part of B being taken by a man?”
现在我们可以认为基础已经清理干净,我们准备好继续讨论我们的问题:“机器能思考吗?” 以及上一节末尾引用的它的变体。我们不能完全放弃问题的原始形式,因为对于替代的适当性会有不同的意见,我们至少必须倾听在这方面必须说的话。如果我首先解释一下我自己对此事的看法,这将使读者的问题变得简单。首先考虑问题的更准确形式。我相信,在大约五十年的时间里,将有可能对存储容量约为 10^9 的计算机进行编程,使它们能够很好地玩模仿游戏,以至于一般询问者在五分钟询问后做出正确辨认的机会不会超过 70%。最初的问题是“机器能思考吗?” 我认为这太没有意义了,不值得讨论。尽管如此,我相信,到本世纪末,词语的使用和一般受过教育的观点将会发生很大的变化,以至于人们将能够谈论机器思维而不会被反驳。我进一步相信,隐藏这些信念不会有任何有用的目的。流行的观点认为,科学家们无情地从一个既定的事实走向另一个既定的事实,从不受到任何改进的猜想的影响,这是完全错误的。只要分清哪些是已证实的事实,哪些是推测,就不会造成任何损害。猜想非常重要,因为它们提出了有用的研究方向。
We may now consider the ground to have been cleared and we are ready to proceed to the debate on our question, “Can machines think?” and the variant of it quoted at the end of the last section. We cannot altogether abandon the original form of the problem, for opinions will differ as to the appropriateness of the substitution and we must at least listen to what has to be said in this connexion. It will simplify matters for the reader if I explain first my own beliefs in the matter. Consider first the more accurate form of the question. I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning. The original question, “Can machines think?” I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted. I believe further that no useful purpose is served by concealing these beliefs. The popular view that scientists proceed inexorably from well-established fact to well-established fact, never being influenced by any improved conjecture, is quite mistaken. Provided it is made clear which are proved facts and which are conjectures, no harm can result. Conjectures are of great importance since they suggest useful lines of research.
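The "70 per cent" prediction can be read as a simple criterion on compiled statistics. The sketch below shows one way to state it, assuming each five-minute session is recorded as a single true/false verdict. The function names and the recording scheme are hypothetical, and questions of sample size and statistical significance are left aside.

```python
def identification_rate(verdicts):
    """Fraction of sessions in which the interrogator identified correctly.

    `verdicts` holds one boolean per five-minute session:
    True if the right identification was made, False otherwise.
    """
    verdicts = list(verdicts)
    if not verdicts:
        raise ValueError("at least one session is needed")
    return sum(verdicts) / len(verdicts)

def meets_prediction(verdicts, threshold=0.70):
    """True if interrogators were right no more than `threshold` of the time."""
    return identification_rate(verdicts) <= threshold
```

For example, with 20 hypothetical sessions of which 13 ended in a correct identification, the rate is 0.65 and the criterion is met. With 9 correct out of 10 it is not.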
我现在开始考虑与我自己相反的意见。
I now proceed to consider opinions opposed to my own.
我无法接受其中的任何部分,但会尝试用神学术语来回答。如果将动物与人类归为一类,我会发现这个论点更有说服力,因为在我看来,典型的有生命和无生命之间的差异比人和其他动物之间的差异更大。如果我们考虑到其他宗教团体的成员可能会如何看待正统观点的任意性,那么它的任意性就会变得更加清晰。基督徒如何看待穆斯林认为妇女没有灵魂的观点?[编辑:Smith 和 Haddad(1975,脚注 2)阐述了这种关于伊斯兰教的误解的根源。]但是让我们把这一点放在一边,回到主要论点。在我看来,上面引用的论点意味着对全能者的全能的严重限制。诚然,有些事情他不能做,比如让一等于二,但我们难道不应该相信,如果他认为合适的话,他有自由将灵魂赋予大象吗?我们可能期望他只会在与突变结合使用这种力量,这种突变为大象提供了适当改进的大脑来满足此类需求。对于机器的情况,可以提出完全相同形式的论证。它可能看起来不同,因为它更难以“吞咽”。但这实际上只意味着我们认为他不太可能考虑适合赋予灵魂的环境。本文的其余部分将讨论所讨论的情况。在试图建造这样的机器时,我们不应该不敬地篡夺他创造灵魂的力量,就像我们在生育孩子时一样:相反,无论哪种情况,我们都是他意志的工具,为他创造的灵魂提供住所。
I am unable to accept any part of this, but will attempt to reply in theological terms. I should find the argument more convincing if animals were classed with men, for there is a greater difference, to my mind, between the typical animate and the inanimate than there is between man and the other animals. The arbitrary character of the orthodox view becomes clearer if we consider how it might appear to a member of some other religious community. How do Christians regard the Moslem view that women have no souls? [EDITOR: Smith and Haddad (1975, footnote 2) address the source of this misconception about Islam.] But let us leave this point aside and return to the main argument. It appears to me that the argument quoted above implies a serious restriction of the omnipotence of the Almighty. It is admitted that there are certain things that He cannot do such as making one equal to two, but should we not believe that He has freedom to confer a soul on an elephant if He sees fit? We might expect that He would only exercise this power in conjunction with a mutation which provided the elephant with an appropriately improved brain to minister to the needs of this sort. An argument of exactly similar form may be made for the case of machines. It may seem different because it is more difficult to “swallow.” But this really only means that we think it would be less likely that He would consider the circumstances suitable for conferring a soul. The circumstances in question are discussed in the rest of this paper. In attempting to construct such machines we should not be irreverently usurping His power of creating souls, any more than we are in the procreation of children: rather we are, in either case, instruments of His will providing mansions for the souls that He creates.
然而,这只是猜测。我对神学论证不太感兴趣,无论它们用来支持什么。过去,此类论点经常被认为不能令人满意。在伽利略时代,人们认为这些经文:“太阳停着……大约整整一天,也不急速落下”(约书亚记 10:13)和“他奠定了大地的根基,使它在任何时候都不会移动”(诗篇 105:5),是对哥白尼理论的充分反驳。以我们目前的知识,这样的论点似乎是徒劳的。当无法获得这些知识时,就会给人留下完全不同的印象。
However, this is mere speculation. I am not very impressed with theological arguments whatever they may be used to support. Such arguments have often been found unsatisfactory in the past. In the time of Galileo it was argued that the texts, “And the sun stood still … and hasted not to go down about a whole day” (Joshua 10:13) and “He laid the foundations of the earth, that it should not move at any time” (Psalm 105:5) were an adequate refutation of the Copernican theory. With our present knowledge such an argument appears futile. When that knowledge was not available it made a quite different impression.
这种论点很少像上面的形式那样公开表达。但它影响了我们大多数思考这个问题的人。我们愿意相信,人类在某些微妙的方面优于其他造物。最好能证明他必然具有优越性,这样他就不会有失去指挥地位的危险。神学论证的流行显然与这种感觉有关。这种力量在知识分子中可能相当强烈,因为他们比其他人更看重思考的力量,并且更倾向于将他们对人类优越性的信念建立在这种力量之上。
This argument is seldom expressed quite so openly as in the form above. But it affects most of us who think about it at all. We like to believe that Man is in some subtle way superior to the rest of creation. It is best if he can be shown to be necessarily superior, for then there is no danger of him losing his commanding position. The popularity of the theological argument is clearly connected with this feeling. It is likely to be quite strong in intellectual people, since they value the power of thinking more highly than others, and are more inclined to base their belief in the superiority of Man on this power.
我认为这个论点没有足够的实质性需要反驳。安慰会更合适:也许这应该在灵魂的轮回中寻求。
I do not think that this argument is sufficiently substantial to require refutation. Consolation would be more appropriate: perhaps this should be sought in the transmigration of souls.
对这一论点的简短回答是,尽管已经确定任何特定机器的能力都受到限制,但只是在没有任何证据的情况下指出,这种限制不适用于人类智力。但我认为这种观点不能轻易被驳回。每当其中一台机器被问到适当的关键问题并给出明确的答案时,我们就知道这个答案一定是错误的,这给了我们一定的优越感。这种感觉是不是很虚幻?毫无疑问,这是非常真实的,但我认为不应该过分重视它。我们自己经常对问题给出错误的答案,因此没有理由对机器的这种错误证据感到非常高兴。此外,只有在这种情况下,我们才能够在与我们取得微弱胜利的那台机器相关的情况下感受到我们的优越性。不存在同时战胜所有机器的问题。简而言之,可能有人比任何特定机器更聪明,但也可能有其他机器更聪明,依此类推。
The short answer to this argument is that although it is established that there are limitations to the powers of any particular machine, it has only been stated, without any sort of proof, that no such limitations apply to the human intellect. But I do not think this view can be dismissed quite so lightly. Whenever one of these machines is asked the appropriate critical question, and gives a definite answer, we know that this answer must be wrong, and this gives us a certain feeling of superiority. Is this feeling illusory? It is no doubt quite genuine, but I do not think too much importance should be attached to it. We too often give wrong answers to questions ourselves to be justified in being very pleased at such evidence of fallibility on the part of the machines. Further, our superiority can only be felt on such an occasion in relation to the one machine over which we have scored our petty triumph. There would be no question of triumphing simultaneously over all machines. In short, then, there might be men cleverer than any given machine, but then again there might be other machines cleverer again, and so on.
我认为,那些坚持数学论证的人大多愿意接受模仿游戏作为讨论的基础。那些相信前两种反对意见的人可能对任何标准都不感兴趣。
Those who hold to the mathematical argument would, I think, mostly be willing to accept the imitation game as a basis for discussion. Those who believe in the two previous objections would probably not be interested in any criteria.
这个论点似乎否定了我们测试的有效性。根据这种观点的最极端形式,人们可以确定机器会思考的唯一方法就是成为机器并感觉自己在思考。然后,人们可以向世界描述这些感受,但当然没有人有理由去注意。同样,根据这种观点,了解一个人的想法的唯一方法就是成为那个特定的人。这实际上是唯我论的观点。这可能是最符合逻辑的观点,但它使思想交流变得困难。A 很可能相信“A 认为但 B 不认为”,而 B 相信“B 认为但 A 不认为”。通常不会在这一点上不断争论,而是采用每个人都认为的礼貌惯例。
This argument appears to be a denial of the validity of our test. According to the most extreme form of this view the only way by which one could be sure that a machine thinks is to be the machine and to feel oneself thinking. One could then describe these feelings to the world, but of course no one would be justified in taking any notice. Likewise according to this view the only way to know that a man thinks is to be that particular man. It is in fact the solipsist point of view. It may be the most logical view to hold but it makes communication of ideas difficult. A is liable to believe “A thinks but B does not” whilst B believes “B thinks but A does not.” Instead of arguing continually over this point it is usual to have the polite convention that everyone thinks.
我相信杰斐逊教授并不想采取极端和唯我主义的观点。或许他会很愿意接受模仿游戏作为考验。该游戏(省略玩家 B)在实践中经常以“viva voce”的名义使用,以发现某人是否真正理解某件事或“鹦鹉学舌般地学会了它”。让我们来听听这样一个活生生的声音的一部分:
I am sure that Professor Jefferson does not wish to adopt the extreme and solipsist point of view. Probably he would be quite willing to accept the imitation game as a test. The game (with the player B omitted) is frequently used in practice under the name of viva voce to discover whether some one really understands something or has “learnt it parrot fashion.” Let us listen in to a part of such a viva voce:
提问者:在你的十四行诗的第一行中,“我可以将你比作夏日”,“春天”不是也同样好或者更好吗?
Interrogator: In the first line of your sonnet which reads “Shall I compare thee to a summer’s day,” would not “a spring day” do as well or better?
证人:它不会扫描。
Witness: It wouldn’t scan.
提问者:“冬天的一天”怎么样?这样就可以扫描了。
Interrogator: How about “a winter’s day”? That would scan all right.
证人:是的,但没有人愿意与冬日进行比较。
Witness: Yes, but nobody wants to be compared to a winter’s day.
询问者:你会说匹克威克先生让你想起了圣诞节吗?
Interrogator: Would you say Mr. Pickwick reminded you of Christmas?
证人:在某种程度上。
Witness: In a way.
询问者:然而圣诞节是冬天,我不认为匹克威克先生会介意这种比较。
Interrogator: Yet Christmas is a winter’s day, and I do not think Mr. Pickwick would mind the comparison.
证人:我不认为你是认真的。冬日是指一个典型的冬日,而不是像圣诞节这样特殊的一天。
Witness: I don’t think you’re serious. By a winter’s day one means a typical winter’s day, rather than a special one like Christmas.
等等。如果十四行诗写作机器能够如此大声地回答,杰斐逊教授会怎么说?我不知道他是否会认为机器“只是人为地发出信号”这些答案,但如果答案像上面的段落一样令人满意和持续,我认为他不会将其描述为“一个简单的发明”。我认为,这个短语的目的是涵盖这样的设备,例如在机器中包含某人阅读十四行诗的记录,并时不时地通过适当的开关将其打开。
And so on. What would Professor Jefferson say if the sonnet-writing machine was able to answer like this in the viva voce? I do not know whether he would regard the machine as “merely artificially signalling” these answers, but if the answers were as satisfactory and sustained as in the above passage I do not think he would describe it as “an easy contrivance.” This phrase is, I think, intended to cover such devices as the inclusion in the machine of a record of someone reading a sonnet, with appropriate switching to turn it on from time to time.
简而言之,我认为大多数支持意识论证的人都可以被说服放弃它,而不是被迫采取唯我论立场。然后他们可能会愿意接受我们的测试。
In short then, I think that most of those who support the argument from consciousness could be persuaded to abandon it rather than be forced into the solipsist position. They will then probably be willing to accept our test.
我不想给人留下这样的印象:我认为意识并不神秘。例如,任何将其本地化的尝试都存在某种悖论。但我认为,在我们回答本文所关心的问题之前,不一定需要解决这些谜团。
I do not wish to give the impression that I think there is no mystery about consciousness. There is, for instance, something of a paradox connected with any attempt to localise it. But I do not think these mysteries necessarily need to be solved before we can answer the question with which we are concerned in this paper.
善良、足智多谋、美丽、友善、有主动性、有幽默感、明辨是非、犯错误、坠入爱河、享受草莓和奶油、让某人爱上它、从经验中学习、正确地使用词语、成为自己思想的主题、像一个人一样具有多样性的行为、做一些真正新的事情。
Be kind, resourceful, beautiful, friendly, have initiative, have a sense of humour, tell right from wrong, make mistakes, fall in love, enjoy strawberries and cream, make some one fall in love with it, learn from experience, use words properly, be the subject of its own thought, have as much diversity of behaviour as a man, do something really new.
通常不会为这些陈述提供任何支持。我相信它们大多是建立在科学归纳原理的基础上的。一个人一生中见过数以千计的机器。从他对它们的观察中,他得出了一些一般性的结论。它们很丑陋,每一个都是为非常有限的目的而设计的,当需要用于微小不同的目的时它们毫无用处,其中任何一个的行为多样性都非常小,等等。自然地,他得出结论,这些是必要的属性机器的一般情况。其中许多限制与大多数机器的存储容量非常小有关。(我假设存储容量的概念以某种方式扩展到涵盖离散状态机以外的机器。确切的定义并不重要,因为当前的讨论中没有声称数学准确性。)几年前,当非常人们对数字计算机知之甚少,如果人们提到它们的特性而不描述它们的结构,可能会引起人们对它们的极大怀疑。这大概是由于科学归纳原理的类似应用。当然,这些原则的应用很大程度上是无意识的。当一个被烧伤的孩子害怕火,并通过躲避火来表明他害怕火时,我应该说他正在应用科学归纳法。(我当然也可以用许多其他方式来描述他的行为。)人类的作品和习俗似乎不是非常适合应用科学归纳法的材料。如果要获得可靠的结果,就必须研究很大一部分时空。否则我们可能(就像大多数英国孩子一样)认为每个人都说英语,而学习法语是愚蠢的。
No support is usually offered for these statements. I believe they are mostly founded on the principle of scientific induction. A man has seen thousands of machines in his lifetime. From what he sees of them he draws a number of general conclusions. They are ugly, each is designed for a very limited purpose, when required for a minutely different purpose they are useless, the variety of behaviour of any one of them is very small, etc., etc. Naturally he concludes that these are necessary properties of machines in general. Many of these limitations are associated with the very small storage capacity of most machines. (I am assuming that the idea of storage capacity is extended in some way to cover machines other than discrete-state machines. The exact definition does not matter as no mathematical accuracy is claimed in the present discussion.) A few years ago, when very little had been heard of digital computers, it was possible to elicit much incredulity concerning them, if one mentioned their properties without describing their construction. That was presumably due to a similar application of the principle of scientific induction. These applications of the principle are of course largely unconscious. When a burnt child fears the fire and shows that he fears it by avoiding it, I should say that he was applying scientific induction. (I could of course also describe his behaviour in many other ways.) The works and customs of mankind do not seem to be very suitable material to which to apply scientific induction. A very large part of space-time must be investigated, if reliable results are to be obtained. Otherwise we may (as most English children do) decide that everybody speaks English, and that it is silly to learn French.
然而,对于所提到的许多残疾,还需要特别说明。无法享受草莓和奶油可能会让读者觉得无关紧要。也许可以制造一台机器来享用这道美味佳肴,但任何试图让机器这样做的尝试都是愚蠢的。这种缺陷的重要之处在于,它会导致其他一些缺陷,例如,导致人与机器之间难以产生像白人与白人之间、或黑人与黑人之间那样的友好关系。
There are, however, special remarks to be made about many of the disabilities that have been mentioned. The inability to enjoy strawberries and cream may have struck the reader as frivolous. Possibly a machine might be made to enjoy this delicious dish, but any attempt to make one do so would be idiotic. What is important about this disability is that it contributes to some of the other disabilities, e.g., to the difficulty of the same kind of friendliness occurring between man and machine as between white man and white man, or between black man and black man.
“机器不会犯错误”的说法似乎很奇怪。人们很想反驳:“他们因此而变得更糟吗?” 但让我们采取一种更加同情的态度,并尝试了解其真正含义。我认为这种批评可以用模仿游戏来解释。据称,审讯者只需向机器提出一些算术问题,就可以将机器与人区分开来。这台机器将因其致命的准确性而被揭露。对此的回答很简单。机器(为玩游戏而编程)不会尝试给出算术问题的正确答案。它会故意引入错误,以迷惑审讯者。机械故障可能会通过不适当的决定来表现出来,即在算术中犯了什么样的错误。即使这种对批评的解释也不够同情。但我们没有足够的篇幅来进一步探讨这个问题。在我看来,这种批评是基于对两种错误的混淆。我们可以称它们为“运作错误”和“结论错误”。功能错误是由于某些机械或电气故障导致机器的行为与设计目的不同。在哲学讨论中,人们喜欢忽略出现此类错误的可能性。因此,我们正在讨论“抽象机器”。这些抽象机器是数学虚构的而不是物理对象。根据定义,它们不会出现运行错误。从这个意义上说,我们确实可以说“机器永远不会犯错误”。只有当机器的输出信号具有某种含义时,才会出现结论错误。例如,机器可以输入数学方程或英语句子。当输入一个假命题时,我们说机器犯了结论错误。显然没有任何理由说机器不会犯这种错误。它可能除了重复输入“0 = 1”之外什么也不做。举个不那么反常的例子,它可能有某种通过科学归纳得出结论的方法。我们必须预料到这种方法偶尔会导致错误的结果。
The claim that “machines cannot make mistakes” seems a curious one. One is tempted to retort, “Are they any the worse for that?” But let us adopt a more sympathetic attitude, and try to see what is really meant. I think this criticism can be explained in terms of the imitation game. It is claimed that the interrogator could distinguish the machine from the man simply by setting them a number of problems in arithmetic. The machine would be unmasked because of its deadly accuracy. The reply to this is simple. The machine (programmed for playing the game) would not attempt to give the right answers to the arithmetic problems. It would deliberately introduce mistakes in a manner calculated to confuse the interrogator. A mechanical fault would probably show itself through an unsuitable decision as to what sort of a mistake to make in the arithmetic. Even this interpretation of the criticism is not sufficiently sympathetic. But we cannot afford the space to go into it much further. It seems to me that this criticism depends on a confusion between two kinds of mistake. We may call them “errors of functioning” and “errors of conclusion.” Errors of functioning are due to some mechanical or electrical fault which causes the machine to behave otherwise than it was designed to do. In philosophical discussions one likes to ignore the possibility of such errors; one is therefore discussing “abstract machines.” These abstract machines are mathematical fictions rather than physical objects. By definition they are incapable of errors of functioning. In this sense we can truly say that “machines can never make mistakes.” Errors of conclusion can only arise when some meaning is attached to the output signals from the machine. The machine might, for instance, type out mathematical equations, or sentences in English. When a false proposition is typed we say that the machine has committed an error of conclusion. There is clearly no reason at all for saying that a machine cannot make this kind of mistake. It might do nothing but type out repeatedly “0 = 1.” To take a less perverse example, it might have some method for drawing conclusions by scientific induction. We must expect such a method to lead occasionally to erroneous results.
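The reply about deliberately introduced arithmetic mistakes can be made concrete. The following is a rough sketch under stated assumptions: the name `humanlike_sum`, the `error_rate` parameter, and the model of a slip as a single digit changed by one are all illustrative choices, not anything specified in the paper.

```python
import random

def humanlike_sum(a, b, error_rate=0.1, rng=None):
    """Add two non-negative integers, occasionally slipping on one digit.

    With probability `error_rate`, one decimal digit of the true sum is
    shifted by +1 or -1 (wrapping within 0..9), imitating a human slip.
    """
    rng = rng or random.Random()
    total = a + b
    if rng.random() >= error_rate:
        return total  # most of the time, answer correctly
    digits = [int(d) for d in str(total)]
    i = rng.randrange(len(digits))
    digits[i] = (digits[i] + rng.choice((-1, 1))) % 10
    return int("".join(str(d) for d in digits))
```

With `error_rate=0.0` the function always returns the true sum, so `humanlike_sum(34957, 70764, error_rate=0.0)` gives 105721. With `error_rate=1.0` it always returns a wrong answer that differs from the true sum in one digit. For comparison, the specimen answer 105621 quoted earlier in the dialogue also differs from 105721 in a single digit.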
当然,只有能够证明机器对某些主题具有某种思想,才能回答机器不能成为其自身思想的主题的主张。尽管如此,“机器操作的主题”似乎确实有意义,至少对于处理它的人来说是这样。例如,如果机器试图找到方程 x^2 − 40x − 11 = 0 的解,那么人们会很想将该方程描述为当时机器主题的一部分。从这个意义上说,机器无疑可以成为它自己的主题。它可用于帮助制定自己的程序,或预测其自身结构改变的影响。通过观察自身行为的结果,它可以修改自己的程序,从而更有效地实现某种目的。这些都是不久的将来的可能性,而不是乌托邦的梦想。
The claim that a machine cannot be the subject of its own thought can of course only be answered if it can be shown that the machine has some thought with some subject matter. Nevertheless, “the subject matter of a machine’s operations” does seem to mean something, at least to the people who deal with it. If, for instance, the machine was trying to find a solution of the equation x^2 − 40x − 11 = 0 one would be tempted to describe this equation as part of the machine’s subject matter at that moment. In this sort of sense a machine undoubtedly can be its own subject matter. It may be used to help in making up its own programmes, or to predict the effect of alterations in its own structure. By observing the results of its own behaviour it can modify its own programmes so as to achieve some purpose more effectively. These are possibilities of the near future, rather than Utopian dreams.
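For concreteness, the equation mentioned above has the exact solutions x = 20 ± √411, which follow from the quadratic formula. A minimal sketch of a program that finds them is given below; the function name `solve_quadratic` is an assumption introduced here.

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0, smaller root first."""
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        raise ValueError("expects a genuine quadratic with real roots")
    root = math.sqrt(disc)
    return ((-b - root) / (2 * a), (-b + root) / (2 * a))

# The equation from the text: x^2 - 40x - 11 = 0.
low, high = solve_quadratic(1, -40, -11)
```

The two roots sum to 40 and multiply to −11, as the coefficients require. One is slightly below zero and the other slightly above 40.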
对机器不能有太多行为多样性的批评只是说它不能有太多存储容量。直到最近,即使是一千位数的存储容量也是非常罕见的。
The criticism that a machine cannot have much diversity of behaviour is just a way of saying that it cannot have much storage capacity. Until fairly recently a storage capacity of even a thousand digits was very rare.
我们在这里考虑的批评往往是来自意识的论证的变相形式。通常,如果一个人坚持认为机器可以做其中一件事情,并描述机器可以使用的方法,那么人们不会给人留下太多印象。人们认为这种方法(无论它是什么,因为它必须是机械的)确实相当基础。比较第 154 页[编辑:本卷]引用的杰斐逊声明中的括号。
The criticisms that we are considering here are often disguised forms of the argument from consciousness. Usually if one maintains that a machine can do one of these things, and describes the kind of method that the machine could use, one will not make much of an impression. It is thought that the method (whatever it may be, for it must be mechanical) is really rather base. Compare the parentheses in Jefferson’s statement quoted on page 154 [EDITOR: of this volume].
在这一点上我完全同意哈特里的观点。值得注意的是,他并没有断言相关机器没有获得该财产,而是洛夫莱斯夫人掌握的证据并不鼓励她相信它们拥有该财产。从某种意义上说,所讨论的机器很可能具有这种特性。假设某个离散状态机具有该属性。分析引擎是一种通用数字计算机,因此,如果其存储容量和速度足够,则可以通过适当的编程来模仿相关机器。也许伯爵夫人和巴贝奇都没有想到这个论点。无论如何,他们没有义务索取所有可以索取的东西。
I am in thorough agreement with Hartree over this. It will be noticed that he does not assert that the machines in question had not got the property, but rather that the evidence available to Lady Lovelace did not encourage her to believe that they had it. It is quite possible that the machines in question had in a sense got this property. For suppose that some discrete-state machine has the property. The Analytical Engine was a universal digital computer, so that, if its storage capacity and speed were adequate, it could by suitable programming be made to mimic the machine in question. Probably this argument did not occur to the Countess or to Babbage. In any case there was no obligation on them to claim all that could be claimed.
整个问题将在学习机器的标题下再次考虑。
This whole question will be considered again under the heading of learning machines.
洛夫莱斯夫人的反对意见的一个变体是,机器“永远无法做任何真正新的事情”。这或许可以用锯子来暂时回避:“太阳底下并无新事。” 谁能肯定他所做的“原创性工作”不只是教学中种下的种子的生长,也不是遵循众所周知的一般原则的结果。这种反对意见的一个更好的变体是,机器永远不会“让我们措手不及”。这个说法是一个更直接的挑战,可以直接应对。机器经常让我感到惊讶。这很大程度上是因为我没有进行足够的计算来决定期望他们做什么,或者更确切地说,因为虽然我做了计算,但我是仓促、草率、冒险的。也许我对自己说,“我想这里的电压应该和那里一样:无论如何,让我们假设它是一样的。” 当然,我经常犯错,结果令我惊讶,因为当实验完成时,这些假设已经被忘记了。这些坦白让我愿意接受有关我的邪恶行为的讲座,但当我证明我所经历的惊喜时,请不要对我的可信度产生任何怀疑。
A variant of Lady Lovelace’s objection states that a machine can “never do anything really new.” This may be parried for a moment with the saw, “There is nothing new under the sun.” Who can be certain that “original work” that he has done was not simply the growth of the seed planted in him by teaching, or the effect of following well-known general principles? A better variant of the objection says that a machine can never “take us by surprise.” This statement is a more direct challenge and can be met directly. Machines take me by surprise with great frequency. This is largely because I do not do sufficient calculation to decide what to expect them to do, or rather because, although I do a calculation, I do it in a hurried, slipshod fashion, taking risks. Perhaps I say to myself, “I suppose the voltage here ought to be the same as there: anyway let’s assume it is.” Naturally I am often wrong, and the result is a surprise for me, for by the time the experiment is done these assumptions have been forgotten. These admissions lay me open to lectures on the subject of my vicious ways, but do not throw any doubt on my credibility when I testify to the surprises I experience.
我并不指望这个答复能让我的批评者闭嘴。他可能会说,这样的惊喜是我的一些创造性的心理行为造成的,并不反映机器的功劳。这让我们回到了意识的论证,而不是惊讶的想法。我们必须考虑结束这一论证,但也许值得一提的是,对令人惊讶的事物的欣赏需要同样多的“创造性的心理行为”,无论令人惊讶的事件源自一个人、一本书、一台机器还是任何东西别的。
I do not expect this reply to silence my critic. He will probably say that such surprises are due to some creative mental act on my part, and reflect no credit on the machine. This leads us back to the argument from consciousness, and far from the idea of surprise. It is a line of argument we must consider closed, but it is perhaps worth remarking that the appreciation of something as surprising requires as much of a “creative mental act” whether the surprising event originates from a man, a book, a machine or anything else.
我认为,机器不能带来惊喜的观点是由于哲学家和数学家尤其容易犯的一个谬论。这是一种假设,一旦一个事实呈现在人们的脑海中,该事实的所有后果都会同时浮现在脑海中。在许多情况下,这是一个非常有用的假设,但人们很容易忘记它是错误的。这样做的一个自然结果是,人们会认为仅仅根据数据和一般原则得出结论是没有任何好处的。
The view that machines cannot give rise to surprises is due, I believe, to a fallacy to which philosophers and mathematicians are particularly subject. This is the assumption that as soon as a fact is presented to a mind all consequences of that fact spring into the mind simultaneously with it. It is a very useful assumption under many circumstances, but one too easily forgets that it is false. A natural consequence of doing so is that one then assumes that there is no virtue in the mere working out of consequences from data and general principles.
确实,离散状态机必定不同于连续状态机。但如果我们遵守模仿游戏的条件,审讯者将无法利用这种差异。如果我们考虑其他一些更简单的连续机器,情况就会更清楚。微分分析仪会做得很好。(微分分析器是一种特定类型的机器,不属于用于某些计算的离散状态类型。)其中一些以类型化形式提供答案,因此适合参加游戏。数字计算机不可能准确预测微分分析仪会对问题给出什么答案,但它完全有能力给出正确的答案。例如,如果要求给出π的值(实际上约为 3.1416),则可以合理地在值 3.12、3.13、3.14、3.15、3.16 之间随机选择,概率为 0.05、0.15、0.55、0.19、0.06(说)。在这些情况下,询问器很难区分差分分析仪和数字计算机。
It is true that a discrete-state machine must be different from a continuous machine. But if we adhere to the conditions of the imitation game, the interrogator will not be able to take any advantage of this difference. The situation can be made clearer if we consider some other simpler continuous machine. A differential analyser will do very well. (A differential analyser is a certain kind of machine not of the discrete-state type used for some kinds of calculation.) Some of these provide their answers in a typed form, and so are suitable for taking part in the game. It would not be possible for a digital computer to predict exactly what answers the differential analyser would give to a problem, but it would be quite capable of giving the right sort of answer. For instance, if asked to give the value of π (actually about 3.1416) it would be reasonable to choose at random between the values 3.12, 3.13, 3.14, 3.15, 3.16 with the probabilities of 0.05, 0.15, 0.55, 0.19, 0.06 (say). Under these circumstances it would be very difficult for the interrogator to distinguish the differential analyser from the digital computer.
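The π example can be sketched in a few lines of Python. This is only an illustration, using the answers and probabilities quoted in the text; the function name `imitate_analyser` and the use of a seeded generator are assumptions of the sketch, not anything described in the paper.

```python
import random

# Answers and probabilities quoted in the text for "the value of pi".
ANSWERS = [3.12, 3.13, 3.14, 3.15, 3.16]
WEIGHTS = [0.05, 0.15, 0.55, 0.19, 0.06]


def imitate_analyser(rng: random.Random) -> float:
    """Return one plausible analyser-style answer for pi (hypothetical helper)."""
    # random.choices draws with replacement according to the given weights.
    return rng.choices(ANSWERS, weights=WEIGHTS, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so that runs are repeatable
    print([imitate_analyser(rng) for _ in range(10)])
```

The digital computer does not reproduce any particular analyser's error. It only produces answers with the right kind of scatter, which is all the interrogator can observe.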
由此可见,我们不能成为机器。我将尝试重现这个论点,但我担心我很难公正地表达它。它似乎运行类似这样的东西。“如果每个人都有一套明确的行为规则来规范自己的生活,那么他就和一台机器没什么两样。但没有这样的规则,所以人不能成为机器。” 未分布的中间很刺眼。我不认为这个论证是这样提出的,但我相信这就是所使用的论证。然而,“行为规则”和“行为法则”之间可能存在一定的混淆,从而使问题变得模糊。我所说的“行为规则”是指诸如“看到红灯就停下来”这样的戒律,人们可以根据这些戒律采取行动,也可以意识到这些戒律。我所说的“行为法则”是指应用于男人身体的自然法则,例如“如果你捏他,他就会发出吱吱声”。如果我们用引用的论点中的“规范他的生活的行为法则”代替“他规范他的生活的行为法则”,那么未分配的中间就不再是不可克服的。因为我们相信,受行为法则的监管不仅意味着成为某种机器(尽管不一定是离散状态机),而且反过来,成为这样的机器也意味着受到这些法则的监管。然而,我们不能轻易地让自己相信,不存在完整的行为法则,也不存在完整的行为规则。我们知道找到这些定律的唯一方法是科学观察,而且我们当然不知道在任何情况下我们都可以说:“我们已经探索得足够多了。没有这样的法律。”
From this it is argued that we cannot be machines. I shall try to reproduce the argument, but I fear I shall hardly do it justice. It seems to run something like this. “If each man had a definite set of rules of conduct by which he regulated his life he would be no better than a machine. But there are no such rules, so men cannot be machines.” The undistributed middle is glaring. I do not think the argument is ever put quite like this, but I believe this is the argument used nevertheless. There may however be a certain confusion between “rules of conduct” and “laws of behaviour” to cloud the issue. By “rules of conduct” I mean precepts such as “Stop if you see red lights,” on which one can act, and of which one can be conscious. By “laws of behaviour” I mean laws of nature as applied to a man’s body such as “if you pinch him he will squeak.” If we substitute “laws of behaviour which regulate his life” for “laws of conduct by which he regulates his life” in the argument quoted the undistributed middle is no longer insuperable. For we believe that it is not only true that being regulated by laws of behaviour implies being some sort of machine (though not necessarily a discrete-state machine), but that conversely being such a machine implies being regulated by such laws. However, we cannot so easily convince ourselves of the absence of complete laws of behaviour as of complete rules of conduct. The only way we know of for finding such laws is scientific observation, and we certainly know of no circumstances under which we could say, “We have searched enough. There are no such laws.”
我们可以更有力地证明任何此类言论都是不合理的。假设我们可以确定找到这样的定律(如果它们存在的话)。那么,给定一个离散状态机,当然应该可以通过对其进行充分的观察来发现它以预测其未来的行为,并且这是在合理的时间内,例如一千年。但事实似乎并非如此。我在曼彻斯特计算机上设置了一个仅使用 1,000 个存储单元的小程序,其中提供一个十六位数的机器在两秒内回复另一个十六位数。我反对任何人从这些回复中充分了解该程序,以便能够预测对未经尝试的值的任何回复。
We can demonstrate more forcibly that any such statement would be unjustified. For suppose we could be sure of finding such laws if they existed. Then given a discrete-state machine it should certainly be possible to discover by observation sufficient about it to predict its future behaviour, and this within a reasonable time, say a thousand years. But this does not seem to be the case. I have set up on the Manchester computer a small programme using only 1,000 units of storage, whereby the machine supplied with one sixteen-figure number replies with another within two seconds. I would defy anyone to learn from these replies sufficient about the programme to be able to predict any replies to untried values.
基于 ESP 的更具体的论证可能如下:“让我们玩模仿游戏,使用一个擅长心灵感应接收器的人和一台数字计算机作为证人。询问者可以提出诸如“我右手中的牌属于什么花色?”之类的问题。该男子通过心灵感应或千里眼,在 400 张卡片中给出了 130 次正确答案。机器只能随机猜测,也许能猜对 104,所以询问器就能做出正确的识别。” 这里有一个有趣的可能性。假设数字计算机包含一个随机数生成器。然后很自然地用它来决定给出什么答案。但随后随机数生成器将受到询问器的心理动力的影响。也许这种念力可能会导致机器比概率计算预期的更频繁地猜测正确,因此询问者可能仍然无法做出正确的识别。另一方面,他或许可以凭借千里眼,不加任何质疑地猜出正确答案。有了 ESP,任何事情都可能发生。
A more specific argument based on E.S.P. might run as follows: “Let us play the imitation game, using as witnesses a man who is good as a telepathic receiver, and a digital computer. The interrogator can ask such questions as ‘What suit does the card in my right hand belong to?’ The man by telepathy or clairvoyance gives the right answer 130 times out of 400 cards. The machine can only guess at random, and perhaps gets 104 right, so the interrogator makes the right identification.” There is an interesting possibility which opens here. Suppose the digital computer contains a random number generator. Then it will be natural to use this to decide what answer to give. But then the random number generator will be subject to the psychokinetic powers of the interrogator. Perhaps this psychokinesis might cause the machine to guess right more often than would be expected on a probability calculation, so that the interrogator might still be unable to make the right identification. On the other hand, he might be able to guess right without any questioning, by clairvoyance. With E.S.P. anything may happen.
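The figures in this example can be checked with a short calculation. The sketch below assumes four equally likely suits and independent guesses, so that the number of hits in 400 trials is binomial with p = 1/4; the function name `z_score` is illustrative.

```python
import math


def z_score(hits: int, trials: int, p: float = 0.25) -> float:
    """Standard deviations by which `hits` exceeds the chance expectation."""
    mean = trials * p                        # 100 expected hits in 400 trials
    sd = math.sqrt(trials * p * (1.0 - p))   # sqrt(75), about 8.66
    return (hits - mean) / sd


if __name__ == "__main__":
    print(round(z_score(130, 400), 2))  # the telepathic receiver
    print(round(z_score(104, 400), 2))  # the machine guessing at random
```

Under these assumptions, 104 correct answers lie well within one standard deviation of chance, while 130 lie more than three standard deviations above it. That gap is what would let the interrogator tell the two witnesses apart.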
如果心灵感应被承认,我们就有必要加强测试。这种情况可以被视为类似于如果审讯者自言自语而其中一名参赛者将耳朵贴在墙上听的情况。将参赛者放入“防心灵感应室”就可以满足所有要求。
If telepathy is admitted it will be necessary to tighten our test up. The situation could be regarded as analogous to that which would occur if the interrogator were talking to himself and one of the competitors was listening with his ear to the wall. To put the competitors into a “telepathy-proof room” would satisfy all requirements.
读者会预料到我没有非常令人信服的积极论据来支持我的观点。如果我有的话,我就不应该如此费力地指出相反观点的谬误。我现在将提供我所掌握的证据。
The reader will have anticipated that I have no very convincing arguments of a positive nature to support my views. If I had I should not have taken such pains to point out the fallacies in contrary views. Such evidence as I have I shall now give.
让我们暂时回到洛夫莱斯夫人的反对意见,她指出机器只能做我们告诉它做的事情。可以说,人可以将一个想法“注入”到机器中,机器会做出一定程度的反应,然后陷入静止,就像琴弦被锤子敲击一样。另一个比喻是小于临界尺寸的原子堆:注入的想法对应于从外部进入原子堆的中子。每个这样的中子都会引起一定的扰动,最终消失。然而,如果堆的尺寸足够大,则由这种入射中子引起的扰动很可能会不断增加,直到整个堆被破坏。心灵是否有相应的现象,机器是否也有相应的现象?似乎确实有一个适合人类心灵的东西。它们中的大多数似乎是“亚临界的”,即,在这个类比中对应于亚临界尺寸的堆。向这样的头脑提出的一个想法平均会产生少于一个的回应。一小部分是超临界的。向这样的头脑提出的想法可能会产生由二级、三级和更遥远的想法组成的整个“理论”。动物的思想似乎绝对是亚批判的。秉承这个类比,我们会问:“机器可以变得超临界吗?”
Let us return for a moment to Lady Lovelace’s objection, which stated that the machine can only do what we tell it to do. One could say that a man can “inject” an idea into the machine, and that it will respond to a certain extent and then drop into quiescence, like a piano string struck by a hammer. Another simile would be an atomic pile of less than critical size: an injected idea is to correspond to a neutron entering the pile from without. Each such neutron will cause a certain disturbance which eventually dies away. If, however, the size of the pile is sufficiently increased, the disturbance caused by such an incoming neutron will very likely go on and on increasing until the whole pile is destroyed. Is there a corresponding phenomenon for minds, and is there one for machines? There does seem to be one for the human mind. The majority of them seem to be “subcritical,” i.e., to correspond in this analogy to piles of subcritical size. An idea presented to such a mind will on average give rise to less than one idea in reply. A smallish proportion are supercritical. An idea presented to such a mind may give rise to a whole “theory” consisting of secondary, tertiary and more remote ideas. Animals’ minds seem to be very definitely subcritical. Adhering to this analogy we ask, “Can a machine be made to be supercritical?”
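One way to make the pile analogy concrete is a simple branching process, in which each idea independently triggers a random number of replies. That framing is an assumption of this sketch rather than something stated in the text. Here each idea makes two independent attempts at a reply, so `mean_replies` must lie between 0 and 2, and the `cap` exists only to stop a supercritical cascade from running forever.

```python
import random


def cascade_size(mean_replies: float, rng: random.Random, cap: int = 10_000) -> int:
    """Total number of ideas set off by one injected idea (stops at `cap`)."""
    p = mean_replies / 2.0   # each idea gets two chances to provoke a reply
    pending, total = 1, 1    # the injected idea itself
    while pending and total < cap:
        pending -= 1
        replies = sum(rng.random() < p for _ in range(2))
        pending += replies
        total += replies
    return total


if __name__ == "__main__":
    rng = random.Random(1)
    for m in (0.5, 0.9, 1.5):
        sizes = [cascade_size(m, rng) for _ in range(200)]
        print(m, sum(sizes) / len(sizes))
```

With fewer than one reply per idea on average the cascade dies away after a handful of ideas. Above one, a sizeable fraction of runs grow until they hit the cap, which is the supercritical case.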
“洋葱皮”的比喻也很有帮助。在考虑心灵或大脑的功能时,我们发现某些操作可以用纯粹的机械术语来解释。我们所说的这并不符合真实的心灵:如果我们要找到真实的心灵,我们就必须剥去它的一层皮。但在剩下的部分中,我们发现了更多的皮肤需要被剥掉,等等。以这种方式进行下去,我们是否会到达“真正的”心灵,或者我们最终是否会到达里面空无一物的皮肤?在后一种情况下,整个头脑都是机械的。(然而,它不会是离散状态机。我们已经讨论过这一点。)
The “skin-of-an-onion” analogy is also helpful. In considering the functions of the mind or the brain we find certain operations which we can explain in purely mechanical terms. This we say does not correspond to the real mind: it is a sort of skin which we must strip off if we are to find the real mind. But then in what remains we find a further skin to be stripped off, and so on. Proceeding in this way do we ever come to the “real” mind, or do we eventually come to the skin which has nothing in it? In the latter case the whole mind is mechanical. (It would not be a discrete-state machine however. We have discussed this.)
最后两段并没有声称是令人信服的论据。它们更应该被描述为“倾向于产生信仰的背诵”。
These last two paragraphs do not claim to be convincing arguments. They should rather be described as “recitations tending to produce belief.”
对于第14.6节开头表达的观点,唯一真正令人满意的支持是等待本世纪末,然后进行所描述的实验。但与此同时我们能说什么?如果实验要成功,现在应该采取什么步骤?
The only really satisfactory support that can be given for the view expressed at the beginning of §14.6, will be that provided by waiting for the end of the century and then doing the experiment described. But what can we say in the meantime? What steps should be taken now if the experiment is to be successful?
正如我所解释的,问题主要是编程问题。工程方面也必须取得进步,但这些似乎不太可能满足要求。对大脑存储容量的估计从 10^10 到 10^15 个二进制数字不等。我倾向于较低的值,并相信只有很小一部分用于较高类型的思维。其中大部分可能用于保留视觉印象。如果需要超过 10^9 才能令人满意地玩模仿游戏(至少是针对盲人),我会感到惊讶。(注:大英百科全书第11版的容量为 2×10^9。)即使通过现有技术,10^7 的存储容量也是非常可行的可能性。可能根本没有必要提高机器的运行速度。现代机器的某些部分可以被视为神经细胞的类似物,其工作速度比神经细胞快大约一千倍。这应该提供一个“安全边际”,可以弥补许多方面引起的速度损失。我们的问题是找出如何对这些机器进行编程来玩游戏。按照我目前的工作速度,我每天大约生成一千位数的程序,这样,如果没有什么东西被扔进废纸篓的话,大约六十名工人,稳定地工作五十年,就可以完成这项工作。一些更快捷的方法似乎是可取的。
As I have explained, the problem is mainly one of programming. Advances in engineering will have to be made too, but it seems unlikely that these will not be adequate for the requirements. Estimates of the storage capacity of the brain vary from 10^10 to 10^15 binary digits. I incline to the lower values and believe that only a very small fraction is used for the higher types of thinking. Most of it is probably used for the retention of visual impressions. I should be surprised if more than 10^9 was required for satisfactory playing of the imitation game, at any rate against a blind man. (Note: The capacity of the Encyclopaedia Britannica, 11th edition, is 2 × 10^9.) A storage capacity of 10^7 would be a very practicable possibility even by present techniques. It is probably not necessary to increase the speed of operations of the machines at all. Parts of modern machines which can be regarded as analogs of nerve cells work about a thousand times faster than the latter. This should provide a “margin of safety” which could cover losses of speed arising in many ways. Our problem then is to find out how to programme these machines to play the game. At my present rate of working I produce about a thousand digits of programme a day, so that about sixty workers, working steadily through the fifty years might accomplish the job, if nothing went into the wastepaper basket. Some more expeditious method seems desirable.
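The estimate of sixty workers over fifty years can be checked against the figure of about 10^9 binary digits. The calculation below assumes 365 working days a year, which the text does not specify.

```python
DIGITS_PER_DAY = 1_000   # one programmer's daily output, as stated in the text
WORKERS = 60
YEARS = 50
DAYS_PER_YEAR = 365      # assumption: the text gives no working calendar

total_digits = DIGITS_PER_DAY * WORKERS * YEARS * DAYS_PER_YEAR

if __name__ == "__main__":
    print(f"{total_digits:.3e}")  # total digits of programme produced
```

Under that assumption the total comes to roughly 1.1 × 10^9 digits, in line with the 10^9 suggested as sufficient for the imitation game.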
在尝试模仿成年人思维的过程中,我们必然会深入思考将其带到当前状态的过程。我们可能会注意到三个组成部分。
In the process of trying to imitate an adult human mind we are bound to think a good deal about the process which has brought it to the state that it is in. We may notice three components.
(a) 最初的心理状态,比如出生时,
(a) The initial state of the mind, say at birth,
(b) 所受的教育,
(b) The education to which it has been subjected,
(c) 不属于教育的其他经历。
(c) Other experience, not to be described as education, to which it has been subjected.
与其尝试制作一个程序来模拟成人的思维,为什么不尝试制作一个模拟孩子的思维呢?如果对其进行适当的教育,人们将获得成人大脑。据推测,儿童大脑就像是从文具店购买的笔记本。机制相当少,还有很多空白纸。(从我们的角度来看,机制和写作几乎是同义词。)我们希望儿童大脑中的机制如此之少,以至于可以轻松地对类似的东西进行编程。这作为初步近似,我们可以假设教育工作量与人类儿童的教育工作量大致相同。
Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s? If this were then subjected to an appropriate course of education one would obtain the adult brain. Presumably the child brain is something like a notebook as one buys it from the stationer’s. Rather little mechanism, and lots of blank sheets. (Mechanism and writing are from our point of view almost synonymous.) Our hope is that there is so little mechanism in the child brain that something like it can be easily programmed. The amount of work in the education we can assume, as a first approximation, to be much the same as for the human child.
因此,我们将问题分为两部分。儿童计划和教育过程。这两者仍然保持着非常密切的联系。我们不能指望一次就能找到一台好的子机。人们必须尝试教授一台这样的机器,看看它的学习效果如何。然后,人们可以尝试另一种,看看它是更好还是更差。通过识别,这个过程和进化之间存在明显的联系
We have thus divided our problem into two parts. The child programme and the education process. These two remain very closely connected. We cannot expect to find a good child machine at the first attempt. One must experiment with teaching one such machine and see how well it learns. One can then try another and see if it is better or worse. There is an obvious connection between this process and evolution, by the identifications:
Structure of the child machine = hereditary material
Changes of the child machine = mutations
Judgment of the experimenter = natural selection
[编辑:第三个方程的两边已从原始论文中的位置调换,因此所有进化项都在右侧。]然而,人们可能希望这一过程比进化更迅速。优胜劣汰是一种衡量优势的缓慢方法。实验者通过运用智力,应该能够加快速度。同样重要的是,他并不局限于随机突变。如果他能找出某种弱点的原因,他可能就能想到哪种突变可以改善这种弱点。
[EDITOR: The sides of the third equation have been swapped from their positions in the original paper, so that all the evolutionary terms are on the right.] One may hope, however, that this process will be more expeditious than evolution. The survival of the fittest is a slow method for measuring advantages. The experimenter, by the exercise of intelligence, should be able to speed it up. Equally important is the fact that he is not restricted to random mutations. If he can trace a cause for some weakness he can probably think of the kind of mutation which will improve it.
不可能对机器应用与正常儿童完全相同的教学过程。例如,它不会配备腿,因此无法要求它出去填充煤斗。可能它没有眼睛。但是,无论巧妙的工程可以很好地克服这些缺陷,人们还是无法将这种生物送到学校而不引起其他孩子的过度取笑。必须给它一些学费。我们不必太关心腿、眼睛等。海伦·凯勒小姐的例子表明,只要老师和学生之间能够通过某种方式进行双向交流,教育就可以发生。我们通常将惩罚和奖励与教学过程联系起来。一些简单的子机器可以根据这种原理构建或编程。机器的构造必须使得惩罚信号发生之前不久的事件不太可能重复,而奖励信号则增加了导致其发生的事件重复的可能性。这些定义并没有预设机器的任何感受,我用一台这样的儿童机器做了一些实验,并成功地教了它一些东西,但教学方法太非正统,实验不能被认为是真正成功的。
It will not be possible to apply exactly the same teaching process to the machine as to a normal child. It will not, for instance, be provided with legs, so that it could not be asked to go out and fill the coal scuttle. Possibly it might not have eyes. But however well these deficiencies might be overcome by clever engineering, one could not send the creature to school without the other children making excessive fun of it. It must be given some tuition. We need not be too concerned about the legs, eyes, etc. The example of Miss Helen Keller shows that education can take place provided that communication in both directions between teacher and pupil can take place by some means or other. We normally associate punishments and rewards with the teaching process. Some simple child machines can be constructed or programmed on this sort of principle. The machine has to be so constructed that events which shortly preceded the occurrence of a punishment signal are unlikely to be repeated, whereas a reward signal increases the probability of repetition of the events which led up to it. These definitions do not presuppose any feelings on the part of the machine. I have done some experiments with one such child machine, and succeeded in teaching it a few things, but the teaching method was too unorthodox for the experiment to be considered really successful.
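The punishment-and-reward principle can be illustrated with a very small program. This is a hypothetical sketch, not a reconstruction of the experiments mentioned in the text: the class `ChildMachine`, the doubling and halving of weights, and the `teach` helper are all assumptions chosen for brevity.

```python
import random


class ChildMachine:
    """Picks actions at random, weighted by how past choices were received."""

    def __init__(self, actions, rng: random.Random):
        self.weights = {a: 1.0 for a in actions}  # every action equally likely at first
        self.rng = rng
        self.last = None

    def act(self):
        actions = list(self.weights)
        self.last = self.rng.choices(
            actions, weights=[self.weights[a] for a in actions], k=1
        )[0]
        return self.last

    def reward(self):
        self.weights[self.last] *= 2.0   # makes the last action more probable

    def punish(self):
        self.weights[self.last] *= 0.5   # makes the last action less probable


def teach(machine: ChildMachine, wanted, lessons: int) -> None:
    """A teacher who rewards the wanted action and punishes any other."""
    for _ in range(lessons):
        if machine.act() == wanted:
            machine.reward()
        else:
            machine.punish()


if __name__ == "__main__":
    pupil = ChildMachine(["a", "b", "c"], random.Random(0))
    teach(pupil, "b", 50)
    print(pupil.weights)
```

The machine needs no notion of what the signals mean. The two signals simply shift the probabilities of the events that preceded them, as the definitions in the text require.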
惩罚和奖励的使用最多只能成为教学过程的一部分。粗略地说,如果老师没有其他方式与学生沟通,他所能获得的信息量不会超过所施加的奖励和惩罚的总数。当孩子学会重复“卡萨比安卡”时,如果只能通过“二十个问题”技巧来发现文本,并且每个“不”都以打击的形式出现,那么他可能会感到非常痛苦。因此,有必要有一些其他“非情感”的沟通渠道。如果这些可用,就可以通过惩罚来教导机器服从以某种语言(例如符号语言)发出的命令的奖励。这些命令将通过“非情感”渠道传达。使用这种语言将大大减少所需的惩罚和奖励的数量。
The use of punishments and rewards can at best be a part of the teaching process. Roughly speaking, if the teacher has no other means of communicating to the pupil, the amount of information which can reach him does not exceed the total number of rewards and punishments applied. By the time a child has learnt to repeat “Casabianca” he would probably feel very sore indeed, if the text could only be discovered by a “Twenty Questions” technique, every “NO” taking the form of a blow. It is necessary therefore to have some other “unemotional” channels of communication. If these are available it is possible to teach a machine by punishments and rewards to obey orders given in some language, e.g., a symbolic language. These orders are to be transmitted through the “unemotional” channels. The use of this language will diminish greatly the number of punishments and rewards required.
对于适合子机的复杂性,意见可能会有所不同。人们可能会尝试使其尽可能简单并符合一般原则。或者,人们可能“内置”一套完整的逻辑推理系统。在后一种情况下,商店将主要被定义和命题占据。这些命题将具有各种状态,例如,已确定的事实、猜想、数学证明的定理、权威给出的陈述、具有命题逻辑形式但不具有信念值的表达式。某些主张可以被描述为“势在必行”。机器的构造应该使得一旦命令被归类为“充分建立”,就会自动发生适当的操作。为了说明这一点,假设老师对机器说:“现在做作业。” 这可能会导致“老师说‘现在做作业’”被包含在既定事实中。另一个这样的事实可能是,“老师说的一切都是真的。” 将这些结合起来,最终可能会导致“现在做作业”这个命令被纳入既定事实之中,而这,通过机器的构造,将意味着作业实际上开始了,而且效果非常令人满意。机器使用的推理过程不必满足最严格的逻辑学家的要求。例如,可能没有类型的层次结构。但这并不意味着会发生类型谬误,就像我们一定会从没有围栏的悬崖上摔下来一样。适当的命令(在系统内表达,不构成系统规则的一部分),例如“不要使用某个类,除非它是老师提到过的类的子类”,可以具有与“不要去”类似的效果。太靠近边缘了。”
Opinions may vary as to the complexity which is suitable in the child machine. One might try to make it as simple as possible consistently with the general principles. Alternatively one might have a complete system of logical inference “built in.” In the latter case the store would be largely occupied with definitions and propositions. The propositions would have various kinds of status, e.g., well-established facts, conjectures, mathematically proved theorems, statements given by an authority, expressions having the logical form of proposition but not belief-value. Certain propositions may be described as “imperatives.” The machine should be so constructed that as soon as an imperative is classed as “well established” the appropriate action automatically takes place. To illustrate this, suppose the teacher says to the machine, “Do your homework now.” This may cause “Teacher says ‘Do your homework now’” to be included amongst the well-established facts. Another such fact might be, “Everything that teacher says is true.” Combining these may eventually lead to the imperative, “Do your homework now,” being included amongst the well-established facts, and this, by the construction of the machine, will mean that the homework actually gets started, but the effect is very satisfactory. The processes of inference used by the machine need not be such as would satisfy the most exacting logicians. There might for instance be no hierarchy of types. But this need not mean that type fallacies will occur, any more than we are bound to fall over unfenced cliffs. Suitable imperatives (expressed within the systems, not forming part of the rules of the system) such as “Do not use a class unless it is a subclass of one which has been mentioned by teacher” can have a similar effect to “Do not go too near the edge.”
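The homework example can be traced in a toy store of propositions. The sketch below is an assumption-laden illustration: propositions are encoded as nested tuples, "Everything that teacher says is true" is represented by the single fact `("trusts", "teacher")`, and the class `Store` and its methods are invented for the purpose.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Well-established facts plus the actions they have triggered."""

    facts: set = field(default_factory=set)
    actions: list = field(default_factory=list)

    def establish(self, proposition: tuple) -> None:
        if proposition in self.facts:
            return
        self.facts.add(proposition)
        if proposition[0] == "imperative":
            # A well-established imperative is acted on at once.
            self.actions.append(proposition[1])
        self._infer()

    def _infer(self) -> None:
        # If the teacher is trusted, whatever the teacher says becomes established.
        if ("trusts", "teacher") in self.facts:
            for p in list(self.facts):
                if p[0] == "says" and p[1] == "teacher":
                    self.establish(p[2])


if __name__ == "__main__":
    store = Store()
    store.establish(("says", "teacher", ("imperative", "do your homework now")))
    print(store.actions)   # nothing happens until the teacher is trusted
    store.establish(("trusts", "teacher"))
    print(store.actions)
```

The action follows only once both facts are in the store, and the order in which they arrive does not matter.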
没有四肢的机器可以遵守的命令必然具有相当智力的特征,如上面给出的例子(做作业)。在这些命令中,重要的是规范相关逻辑系统的规则的应用顺序的命令。因为在使用逻辑系统的每个阶段,都有大量的可供选择的步骤,只要遵守逻辑系统的规则,就可以使用其中的任何一个步骤。这些选择决定了聪明的推理者和愚蠢的推理者之间的区别,而不是健全的推理者和错误的推理者之间的区别。导致此类祈使句的命题可能是“当提到苏格拉底时,使用芭芭拉的三段论”或“如果一种方法已被证明比另一种方法更快,则不要使用较慢的方法”。其中一些可能是“由权威给出的”,但另一些可能是由机器本身产生的,例如通过科学归纳。学习机的想法对于一些读者来说可能显得自相矛盾。机器的操作规则如何改变?他们应该完整地描述机器将如何反应,无论它的历史是什么,无论它可能经历什么变化。因此,这些规则是相当不随时间变化的。这是千真万确的。对这个悖论的解释是,在学习过程中改变的规则是一种不那么自命不凡的规则,只声称具有短暂的有效性。读者可以将其与美国宪法进行比较。
The imperatives that can be obeyed by a machine that has no limbs are bound to be of a rather intellectual character, as in the example (doing homework) given above. Important amongst such imperatives will be ones which regulate the order in which the rules of the logical system concerned are to be applied. For at each stage when one is using a logical system, there is a very large number of alternative steps, any of which one is permitted to apply, so far as obedience to the rules of the logical system is concerned. These choices make the difference between a brilliant and a footling reasoner, not the difference between a sound and a fallacious one. Propositions leading to imperatives of this kind might be “When Socrates is mentioned, use the syllogism in Barbara” or “If one method has been proved to be quicker than another, do not use the slower method.” Some of these may be “given by authority,” but others may be produced by the machine itself, e.g. by scientific induction. The idea of a learning machine may appear paradoxical to some readers. How can the rules of operation of the machine change? They should describe completely how the machine will react whatever its history might be, whatever changes it might undergo. The rules are thus quite time-invariant. This is quite true. The explanation of the paradox is that the rules which get changed in the learning process are of a rather less pretentious kind, claiming only an ephemeral validity. The reader may draw a parallel with the Constitution of the United States.
学习机的一个重要特征是,它的老师通常对内部发生的事情一无所知,尽管他仍然能够在某种程度上预测学生的行为。这最适用于对由经过良好设计(或程序)的子机器产生的机器的后续教育。这与使用机器进行计算时的正常程序形成鲜明对比,然后,一个人的目标是在计算中的每个时刻对机器的状态有一个清晰的印象。这个目标只有通过斗争才能实现。面对这种情况,“机器只能做我们知道如何命令它做的事情”的观点显得很奇怪。我们放入机器中的大多数程序都会导致机器做出一些我们根本无法理解的事情,或者我们认为完全随机的行为。智能行为可能是与计算中完全受纪律的行为的偏离,但这种偏离相当轻微,不会产生随机行为或无意义的重复循环。通过教学过程让我们的机器在模仿游戏中做好准备的另一个重要结果是,“人类的错误”很可能以一种相当自然的方式被省略,即无需特殊的“指导”。(读者应该将这一点与第 156 页的观点相一致。)习得的过程并不能产生百分百确定的结果;如果他们这样做了,他们就不可能被遗忘。
An important feature of a learning machine is that its teacher will often be very largely ignorant of quite what is going on inside, although he may still be able to some extent to predict his pupil’s behaviour. This should apply most strongly to the later education of a machine arising from a child machine of well-tried design (or programme). This is in clear contrast with normal procedure when using a machine to do computations: one’s object is then to have a clear mental picture of the state of the machine at each moment in the computation. This object can only be achieved with a struggle. The view that “the machine can only do what we know how to order it to do,” appears strange in face of this. Most of the programmes which we can put into the machine will result in its doing something that we cannot make sense of at all, or which we regard as completely random behaviour. Intelligent behaviour presumably consists in a departure from the completely disciplined behaviour involved in computation, but a rather slight one, which does not give rise to random behaviour, or to pointless repetitive loops. Another important result of preparing our machine for its part in the imitation game by a process of teaching and learning is that “human fallibility” is likely to be omitted in a rather natural way, i.e., without special “coaching.” (The reader should reconcile this with the point of view on page 156.) Processes that are learnt do not produce a hundred per cent certainty of result; if they did they could not be unlearnt.
在学习机中包含随机元素可能是明智的。当我们寻找某个问题的解决方案时,随机元素非常有用。举例来说,假设我们想要找到一个 50 到 200 之间的数字,该数字等于其数字之和的平方,我们可以从 51 开始,然后尝试 52 并继续,直到找到一个有效的数字。或者,我们可以随机选择数字,直到找到一个好的数字。这种方法的优点是不必跟踪已尝试过的值,但缺点是可能会尝试相同的值两次,但如果有多个解决方案,则这不是很重要。系统方法的缺点是在必须首先研究的区域中可能存在没有任何解的巨大块。现在,学习过程可以被视为寻找一种能够满足老师(或其他标准)的行为形式。由于可能存在大量令人满意的解决方案,因此随机方法似乎比系统方法更好。应该注意的是,它被用在类似的进化过程中。但在那里,系统化的方法是不可能的。如何记录已经尝试过的不同基因组合,以避免再次尝试?
It is probably wise to include a random element in a learning machine. A random element is rather useful when we are searching for a solution of some problem. Suppose for instance we wanted to find a number between 50 and 200 which was equal to the square of the sum of its digits, we might start at 51 then try 52 and go on until we got a number that worked. Alternatively we might choose numbers at random until we got a good one. This method has the advantage that it is unnecessary to keep track of the values that have been tried, but the disadvantage that one may try the same one twice, but this is not very important if there are several solutions. The systematic method has the disadvantage that there may be an enormous block without any solutions in the region which has to be investigated first. Now the learning process may be regarded as a search for a form of behaviour which will satisfy the teacher (or some other criterion). Since there is probably a very large number of satisfactory solutions the random method seems to be better than the systematic. It should be noticed that it is used in the analogous process of evolution. But there the systematic method is not possible. How could one keep track of the different genetical combinations that had been tried, so as to avoid trying them again?
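The two search methods can be compared directly on the example given. The sketch reads "between 50 and 200" as the integers 51 to 199, and the function names are illustrative.

```python
import random


def works(n: int) -> bool:
    """True if n equals the square of the sum of its decimal digits."""
    return n == sum(int(d) for d in str(n)) ** 2


def systematic(lo: int = 51, hi: int = 199):
    """Try lo, lo + 1, ... in order; return (solution, number of tries)."""
    for tries, n in enumerate(range(lo, hi + 1), start=1):
        if works(n):
            return n, tries
    raise ValueError("no solution in range")


def at_random(rng: random.Random, lo: int = 51, hi: int = 199):
    """Draw candidates at random, keeping no record of earlier tries."""
    tries = 0
    while True:
        tries += 1
        n = rng.randint(lo, hi)
        if works(n):
            return n, tries


if __name__ == "__main__":
    print(systematic())
    print(at_random(random.Random(0)))
```

In this range the only solution is 81, since 8 + 1 = 9 and 9² = 81. The systematic search reaches it on the 31st try, while the random search needs no memory of what it has tried but may take more or fewer draws. With a single solution the random method has little to recommend it, which fits the remark that its drawback matters less when solutions are plentiful.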
我们可能希望机器最终能够在所有纯智力领域与人类竞争。但最好从哪些开始呢?即使这是一个艰难的决定。许多人认为非常抽象的活动(例如下棋)是最好的。也可以认为,最好为机器提供金钱能买到的最好的感觉器官,然后教它理解和说英语。这个过程可以遵循孩子的正常教学。事情会被指出并命名,等等。我再次不知道正确的答案是什么,但我认为这两种方法都应该尝试。
We may hope that machines will eventually compete with men in all purely intellectual fields. But which are the best ones to start with? Even this is a difficult decision. Many people think that a very abstract activity, like the playing of chess, would be best. It can also be maintained that it is best to provide the machine with the best sense organs that money can buy, and then teach it to understand and speak English. This process could follow the normal teaching of a child. Things would be pointed out and named, etc. Again I do not know what the right answer is, but I think both approaches should be tried.
我们只能看到前方一小段距离,但我们可以看到还有很多需要做的事情。
We can only see a short distance ahead, but we can see plenty there that needs to be done.
经牛津大学出版社许可,转载自图灵 (1950)。
Reprinted from Turing (1950), with permission from Oxford University Press.
1946年,时任英国剑桥大学数学实验室主任的莫里斯·威尔克斯(Maurice Wilkes,1913-2010)参加了宾夕法尼亚大学摩尔学院授课的暑期学校,了解了伯克斯、戈德斯坦、冯·诺依曼等人的 EDVAC。在回程中,他开始设计一台被他称为电子延迟存储自动计算器(EDSAC)的机器。到 1949 年,当 EDSAC 启动并运行时,一台名为 Mark 1 的存储程序计算机已经在英国曼彻斯特大学投入运行,威尔克斯在这篇文章中和图灵在第 14 章(第 150 页和第 159 页)中都提到了这个项目。曼彻斯特 Mark 1 更像是一个实验原型,而不是计算主力(后续机器于 1951 年投入商业生产,名为 Ferranti Mark 1)。
In 1946, Maurice Wilkes (1913–2010), then head of the Mathematics Laboratory at the University of Cambridge in England, attended a summer school taught at the Moore School of the University of Pennsylvania and learned about the EDVAC of Burks, Goldstine, von Neumann, et al. On the return voyage, he began the design of a machine he dubbed the Electronic Delay Storage Automatic Calculator, or EDSAC. By 1949, when the EDSAC was up and running, a stored-program computer called the Mark 1 was already operational in England at the University of Manchester—a project mentioned both by Wilkes in this piece and by Turing in chapter 14 (pages 150 and 159). The Manchester Mark 1 served more as an experimental prototype than as a computational workhorse (a follow-on machine was put into commercial production in 1951 as the Ferranti Mark 1).
相比之下,EDSAC 很快就被用作剑桥科学界的资源,威尔克斯开始研制后续机器。当他设计数据格式和指令集时,他意识到,为寄存器内移位或寄存器之间移动位等微操作设计一个更原始的微指令集要容易得多,然后将实际的机器指令实现为这些微指令的微程序——正如他后来所说的那样,“为控制单元提供了微型编程计算机的全部灵活性”(Wilkes,1986)。在本文的结尾,他甚至想象程序员有一天可能能够选择自己的指令集。
The EDSAC, by contrast, was soon being used to capacity as a resource to the Cambridge scientific community, and Wilkes began work on a successor machine. As he designed the data formats and instruction set, he came to the crucial realization that it would be far easier to design a more primitive micro-instruction set for such micro-operations as shifting bits within a register or moving bits between registers, and then to implement the actual machine instructions as micro-programs of those micro-instructions—“giving the control unit the full flexibility of a programmed computer in miniature,” as he later put it (Wilkes, 1986). At the end of this article he even imagines that programmers might some day be able to choose their own instruction sets.
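The idea of building machine instructions out of micro-operations can be shown with a toy interpreter. Everything in the sketch is hypothetical: the register names, the three micro-operations, and the `TRIPLE` and `CLEAR` instructions are chosen for illustration and do not describe any machine Wilkes designed.

```python
# Each machine instruction is a micro-program: a list of micro-operations.
MICROCODE = {
    # acc <- 3 * acc, built from a move, a one-bit left shift and an add
    "TRIPLE": [("move", "acc", "tmp"), ("shl", "tmp"), ("add", "tmp", "acc")],
    # acc <- 0, using a register assumed to hold zero
    "CLEAR": [("move", "zero", "acc")],
}


def execute(instruction: str, regs: dict) -> dict:
    """Run the micro-program for one machine instruction on the register file."""
    for op, *args in MICROCODE[instruction]:
        if op == "move":        # copy bits between registers
            src, dst = args
            regs[dst] = regs[src]
        elif op == "shl":       # shift bits within a register
            (reg,) = args
            regs[reg] <<= 1
        elif op == "add":       # pass two registers through the adder
            src, dst = args
            regs[dst] += regs[src]
        else:
            raise ValueError(f"unknown micro-operation: {op}")
    return regs


if __name__ == "__main__":
    print(execute("TRIPLE", {"acc": 7, "tmp": 0, "zero": 0}))
```

Changing or adding a machine instruction means editing the `MICROCODE` table rather than the interpreter, which is the flexibility the passage describes.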
微代码确实很快被认为是设计计算机的“最佳方式”。在 20 世纪 60 年代,IBM 在其 System/360 系列计算机的设计中采用了微代码,极大地简化了调试任务,甚至可以现场修改乘法等复杂运算的设计。灵活、分层、可更新的设计理念是现代系统中普遍使用固件的背后原因。
Microcode was indeed quickly recognized as the “best way” to design a computer. In the 1960s, IBM adopted microcode for the design of its System/360 line of computers, greatly easing the task of debugging and even field-modifying the design of complex operations such as multiplication. The idea of flexible, hierarchical, updatable design is behind the ubiquitous use of firmware in modern systems.
威尔克斯后来在剑桥大学担任教授,取得了杰出的职业生涯,并撰写了最早的计算机编程教科书之一。由于他对该领域的诸多贡献,他于 1967 年获得了图灵奖。
Wilkes went on to a distinguished career as a professor at Cambridge and authored one of the earliest textbooks on computer programming. He received the Turing Award in 1967 for his many contributions to the field.
......我认为大多数人都会同意,目前设计师首先考虑的是如何使他的机器实现最大程度的可靠性。除其他因素外,机器的可靠性取决于以下因素:
…I think that most people will agree that the first consideration for a designer at the present time is how he is to achieve the maximum degree of reliability in his machine. Amongst other things the reliability of the machine will depend on the following:
(a) 其包含的设备数量。
(a) The amount of equipment it contains.
(b) 其复杂性。
(b) Its complexity.
(c) 单元的重复程度。
(c) The degree of repetition of units.
我所说的机器的复杂性是指各个单元之间的交叉连接模糊其逻辑相互关系的程度。如果一台机器由许多单元以简单的方式连接在一起而没有交叉连接,那么它就更容易维修;它也更容易构建,因为不同的人可以在不同的单元上工作而不会互相妨碍。
By the complexity of a machine I mean the extent to which cross-connections between the various units obscure their logical interrelation. A machine is easier to repair if it consists of a number of units connected together in a simple way without cross-connections between them; it is also easier to construct since different people can work on the different units without getting in each other’s way.
至于重复,我认为每个人都更愿意在机器的特定部分中拥有一组五个相同的单元,而不是一组五个不同的单元。大多数人更愿意拥有六个相同的单位,而不是五个不同的单位。为了实现重复,人们应该准备在接受更多设备方面走多远,这是一个见仁见智的问题。此事可表述如下。假设认为机器的特定部分同样需要由一组n 个不同的单元组成,或者由一组kn个相同的单元组成,所有单元具有相似的尺寸。k的值是多少?我的猜想是k > 2。我应该说我正在考虑一台大约有 10 组单元的机器,并且n大约等于 10。
As regards repetition I think everyone would prefer to have in a particular part of the machine a group of five identical units rather than a group of five different units. Most people would prefer to have six identical units rather than five different units. How far one ought to be prepared to go in the direction of accepting a greater quantity of equipment in order to achieve repetition is a matter of opinion. The matter may be put as follows. Suppose that it is regarded as being equally desirable to have a particular part of the machine composed of a group of n different units, or composed of a group of kn identical units, all the units being of similar size. What is the value of k? My conjecture is that k > 2. I should say that I am thinking of a machine which has about 10 groups of units and that n is approximately equal to 10.
我刚才的言论具有普遍适用性。我现在将尝试更加具体。如果构建一台并行机,那么在算术单元中就有一个很好的例子,即一台由重复多次的相同单元组成的设备。然而,这样的算术单元比串行机中的算术单元大得多。另一方面,我认为确实可以说并行机中的控制比串行机中的控制更简单。我在这里使用“控制”一词是在非常一般的意义上,包括不属于存储本身(即,它包括访问电路)或算术单元中的寄存器和加法器的所有内容。……
The remarks I have just made are of general application. I will now try to be more specific. If one builds a parallel machine one has a good example, in the arithmetical unit, of a piece of equipment consisting of identical units repeated many times. Such an arithmetical unit is, however, much larger than that in a serial machine. On the other hand I think it is true to say that the control in a parallel machine is simpler than in a serial machine. I am using the word control here in a very general sense to include everything that does not appertain to the store proper (i.e., it includes the access circuits) or to the registers and adders in the arithmetical unit.…
因此,我们想到一个由多个标准单元组成的算术单元,每个标准单元包含四个触发器(一个属于四个寄存器中的每一个)以及一个加法器。将提供门,以便在必要时通过加法器将数字从一个寄存器传输到另一个寄存器。这些传输将通过对从算术单元出现的一组电线中的一根或多根脉冲进行影响。
We are thus led to think of an arithmetical unit composed of a number of standard units each containing four flip-flops (one belonging to each of four registers) together with an adder. Gates would be provided to make possible the transfer of numbers from one register to another, through the adder when necessary. These transfers would be effected by pulsing one or more of a set of wires emerging from the arithmetical unit.
机器的控制中还需要有寄存器。这些在曼彻斯特机器和 EDSAC 中分别给出的名称如下:
It is also necessary to have registers in the control of a machine. These, with the names given to them respectively in the Manchester machine and in the EDSAC, are as follows:
用于保存下一个要执行的命令(控制或顺序控制槽)的地址的寄存器。
Register for holding the address of the next order due to be executed (control, or sequence control tank).
保存当前正在执行的指令的寄存器(当前指令寄存器,或指令槽)。
Register holding order at present being executed (current instruction register, or order tank).
用于计算乘法或移位操作中步数的寄存器(曼彻斯特机上的快速乘法器、EDSAC 中的定时控制槽不需要)。
Register for counting the number of steps in a multiplication or shifting operation (not needed with the fast multiplier on the Manchester machine, timing control tank in the EDSAC).
另外曼彻斯特机还有多个B寄存器。
In addition the Manchester machine has a number of B registers.
如果一个 B 寄存器被认为是足够的,我们正在考虑的并行机可以使用与算术寄存器相同的单元(包含 4 个触发器和 1 个加法器)作为控制寄存器。通过这种方式,可以实现极端程度的重复。
If one B register is considered to be sufficient the parallel machine we are considering can use the same unit (containing 4 flip-flops and 1 adder) for the control registers as for arithmetical registers. In this way an extreme degree of repetition can be achieved.
仍然需要考虑适当的控制,即机器中提供脉冲以操作与算术和控制寄存器相关的门的部分。机器这部分的设计者通常以特别的方式进行,绘制框图,直到他看到一种满足他的要求并且看起来相当经济的布置。我想建议一种方法,使控制变得系统化,从而降低复杂性。
It remains to consider the control proper, that is, the part of the machine which supplies the pulses for operating the gates associated with the arithmetical and control registers. The designer of this part of a machine usually proceeds in an ad hoc manner, drawing block diagrams until he sees an arrangement which satisfies his requirements and appears to be reasonably economical. I would like to suggest a way in which the control can be made systematic, and therefore less complex.
机器的指令代码中的指令所要求的每个操作都涉及一系列步骤,这些步骤可以包括从存储到控制或算术寄存器的传输,或者反之亦然,以及从一个寄存器到另一个寄存器的传输。这些步骤中的每一步都是通过对与控制和算术寄存器相关的某些线路施加脉冲来实现的,我将其称为“微操作”。因此,每个真正的机器操作都是由微操作的序列(即“微程序”)组成。
Each operation called for by an order in the order code of the machine involves a sequence of steps which may include transfers from the store to control or arithmetical registers, or vice versa, and transfers from one register to another. Each of these steps is achieved by pulsing certain of the wires associated with the control and arithmetical registers, and I will refer to it as a “micro-operation.” Each true machine operation is thus made up of a sequence or “micro-programme” of micro-operations.
图 15.1显示了执行微操作的脉冲的产生方式。启动微操作的定时脉冲进入解码树,并根据寄存器R上设置的编号路由到输出之一。它进入整流器矩阵A,并根据整流器的布置在该矩阵的某些输出线上产生脉冲。这些脉冲操作与控制和算术寄存器相关的门,并导致执行正确的微操作。来自解码树的脉冲也传递到矩阵B并在该矩阵的某些输出线上产生脉冲。这些脉冲通过短延迟线传导到寄存器R并导致其上设置的数字发生变化。结果是进入解码树的下一个启动脉冲将从不同的出口出现,并因此导致执行不同的微操作。因此可以看出,矩阵A中的每一行整流器对应于执行机器操作所需的序列中的微指令之一。
Figure 15.1 shows the way in which pulses for performing the micro-operations may be generated. The timing pulse which initiates a micro-operation enters the decoding tree and is routed to one of the outputs according to the number set on the register R. It passes into the rectifier matrix A and gives rise to pulses on certain of the output wires of this matrix according to the arrangement of the rectifiers. These pulses operate the gates associated with the control and arithmetical registers, and cause the correct micro-operation to be performed. The pulse from the decoding tree also passes into matrix B and gives rise to pulses on certain of the output wires of this matrix. These pulses are conducted, via a short delay line, to the register R and cause the number set up on it to be changed. The result is that the next initiating pulse to enter the decoding tree will emerge from a different outlet and will consequently cause a different micro-operation to be performed. It will thus be seen that each row of rectifiers in matrix A corresponds to one of the micro-orders in the sequence required to perform a machine operation.
所描述的系统将使得只能执行固定的操作周期。通过使某些微指令成为有条件的,可以极大地扩展其效用,因为它们后面跟着根据机器状态的两个替代微指令之一。这可以通过在进入矩阵B之前使解码树的输出成为分支来完成。分支处的脉冲方向由来自机器另一部分的电线上的电势控制;例如,它可能来自累加器的符号触发器。图 15.1中矩阵A的底行对应于条件微阶。矩阵A包含微指令序列,用于执行机器指令代码中的所有基本操作。执行特定操作所需的只是“微控制”应按适当的顺序切换到第一个微指令。这是通过在寄存器R的前四个或五个触发器上设置命令的功能数字,在其他触发器上设置零来完成的。
The system as described would enable a fixed cycle of operations only to be performed. Its utility can be greatly extended by making some of the micro-orders conditional in the sense that they are followed by one of two alternative micro-orders according to the state of the machine. This can be done by making the output of the decoding tree branch before it enters matrix B. The direction the pulse takes at the branch is controlled by the potential on a wire coming from another part of the machine; for example, it might come from the sign flip-flop of the accumulator. The bottom row of matrix A in Figure 15.1 corresponds to a conditional micro-order. The matrix A contains sequences of micro-orders for performing all the basic operations in the order code of the machine. All that is necessary to perform a particular operation is that “micro-control” shall be switched to the first micro-order in the appropriate sequence. This is done by causing the function digits of the order to be set up on the first four or five flip-flops of the register R, zero being set on the others.
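The scheme of Figure 15.1 can be mimicked in software. The following is a minimal sketch, not a reconstruction of Wilkes's circuit: the wire names, the five-entry control store, and the class and function names are all hypothetical, chosen only for illustration. Each entry plays the part of one row of the matrices: `pulses` stands for the outputs of matrix A, `next_r` for the new value that matrix B sets on register R, and `next_if_negative` for the alternative branch taken by a conditional micro-order.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MicroOrder:
    """One micro-order, selected by the number held in register R."""
    pulses: Tuple[str, ...]                 # matrix A: gate wires to pulse
    next_r: Optional[int]                   # matrix B: next value of R (None ends the sequence)
    next_if_negative: Optional[int] = None  # alternate next R for a conditional micro-order


# Hypothetical control store; wire names are illustrative assumptions.
CONTROL_STORE = {
    0: MicroOrder(("store_to_adder", "acc_to_adder"), 1),
    1: MicroOrder(("adder_to_acc",), 2),
    2: MicroOrder(("test_sign",), 3, next_if_negative=4),  # conditional micro-order
    3: MicroOrder(("acc_to_store",), None),
    4: MicroOrder(("clear_acc",), 3),
}


def run_microprogram(start, sign_is_negative, store=CONTROL_STORE):
    """Step register R through the control store, collecting the pulsed wires."""
    r = start
    trace = []
    while r is not None:
        order = store[r]             # decoding tree: route the timing pulse by R
        trace.extend(order.pulses)   # matrix A: operate the gates
        if order.next_if_negative is not None and sign_is_negative:
            r = order.next_if_negative   # branch steered by the sign flip-flop
        else:
            r = order.next_r             # matrix B: set the next number on R
    return trace
```

Starting a machine operation corresponds to calling `run_microprogram` with the number of its first micro-order; changing the order code amounts to editing `CONTROL_STORE`, the software analogue of rewiring the matrices.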
以这种方式设计的控制系统在结构上当然是非常合乎逻辑的,但是可能会提出两个在含义上略有矛盾的评论。首先,可以说这种安排并没有什么新意,因为它利用了触发器、门和混合二极管,而这些都是构建任何控制的元件。对于这种批评,我表示同意。事实上,现在存在或正在建造的各种机器的控制无疑可以以某种与图 15.1非常相似的方式绘制。另一个反对意见是,该方案在设备方面显得相当奢侈。我认为这是不正确的,特别是如果允许某些偏离图 15.1的精确形式的话。我认为,从逻辑布局开始,人们很可能会得到既合乎逻辑又经济的最终安排。此外,人们能够在每个阶段看到为了实现经济而以逻辑布局的方式牺牲了什么,反之亦然。为了了解所需的微指令数量,我为一台简单的机器构建了一个微程序,具有以下指令:加、减、乘(两个指令,一个用于乘数,一个用于被乘数),右移和左移(任意位数),从累加器传输到存储,根据累加器中数字的符号进行条件操作,根据 B 寄存器中数字的符号进行条件操作(一个 B 寄存器是假定),从存储传输到 B 寄存器、输入和输出。微程序还提供从商店初步提取订单的功能(EDSAC 术语中的第一阶段)。执行所有这些操作只需要 40 个微指令。
A control system designed in this way is certainly very logical in structure but two comments, slightly contradictory in their implications, might be made. In the first place it might be said that there is nothing very new about the arrangement since it makes use of flip-flops, gates, and mixing diodes which are the elements out of which any control is built. With this criticism I would agree. In fact, the controls of various machines now in existence or being constructed could no doubt be drawn in some way closely resembling Figure 15.1. The other objection is that the scheme appears to be rather extravagant in equipment. This I think is not true, particularly if some departures from the precise form of Figure 15.1 are allowed. I think that by starting with a logical layout one is likely to arrive at a final arrangement which is both logical and economical. Moreover, one is able to see at each stage what one is sacrificing in the way of logical layout in order to achieve economy and vice versa. In order to get some idea of the number of micro-orders required I have constructed a micro-programme for a simple machine with the following orders: add, subtract, multiply (two orders, one for the multiplier, one for the multiplicand), right and left shift (any number of places), transfer from the accumulator to the store, conditional operation depending on the sign of the number in the accumulator, conditional operation depending on the sign of the number in the B register (one B register is assumed), transfer from the store to the B register, input, and output. The micro-programme also provides for the preliminary extraction of the order from the store (Stage 1 in EDSAC terminology). Only 40 micro-orders are required to perform all these operations.
制定微程序所涉及的考虑因素与制定普通程序所涉及的考虑因素类似。因此,控制的最终细节是通过系统过程而不是基于使用框图的通常的临时程序来确定的。当然,需要健全的工程来设计解码树和矩阵,通过在矩阵中适当地布置整流器,可以将其用于任何所需的微程序。这种控制设计方法的一个重要优点是,直到机器构造的后期阶段才需要最终决定订单代码;甚至可以在机器投入运行后通过重新连接矩阵来更改它。
The considerations involved in drawing-up a micro-programme resemble those involved in drawing-up an ordinary programme. The final details of the control are thus settled by a systematic process instead of by the usual ad hoc procedures based on the use of block diagrams. Of course, sound engineering would be necessary to produce designs for the decoding tree and the matrices which could be used for any desired micro-programme by arranging the rectifiers suitably in the matrices. One important advantage of this method of designing the control is that the order code need not be decided on finally until a late stage in the construction of the machine; it would even be possible to change it after the machine had been put into operation simply by rewiring the matrices.
经 Elsevier 许可,转载自 Wilkes (1981)。
Reprinted from Wilkes (1981), with permission from Elsevier.
Grace Murray Hopper(1906–1992)有着非凡的职业生涯,如今全国女性计算机科学家会议(一年一度的 Grace Murray Hopper 会议)和一艘美国海军军舰(导弹驱逐舰 USS Hopper 号)均以她的名字命名,这是她应得的荣誉。1934 年,她在耶鲁大学获得数学博士学位,并在她取得本科学位的瓦萨学院担任数学教授,直到第二次世界大战开始。1940 年左右开始,她尝试加入海军,但因年龄太大而被拒绝。1943 年,尽管体重不足 120 磅,她最终还是被海军预备役录取。她被分配到哈佛大学霍华德·艾肯的计算实验室,在那里她对 Mark I 及其后继机器 Mark II 进行了编程。她曾将一只导致继电器故障的飞蛾粘贴在一本日志中,这件事广为人知。她讽刺性的批注“第一个发现错误的实际案例”承认她没有创造“错误”这个术语,该术语当时已经是表示机器错误的工程术语。更重要的是,她和艾肯一样认识到,计算的未来将既涉及商业数据处理,也涉及科学计算。
Grace Murray Hopper (1906–1992) had a remarkable professional career, duly honored today both in the name of the national meeting of women computer scientists (the annual Grace Murray Hopper Conference) and in the name of a U.S. Naval warship (the USS Hopper, a guided missile destroyer). In 1934 she received a PhD in mathematics from Yale, and until the start of World War II was a mathematics professor at Vassar, where she had earned her undergraduate degree. Starting around 1940 she tried to join the Navy, but was rejected as too old. In 1943 she finally was accepted into the Naval Reserve in spite of being underweight at 120 pounds. She was assigned to Howard Aiken’s Computation Lab at Harvard, where she programmed the Mark I and its successor machine the Mark II. She famously taped into a log book a moth that had caused a relay to malfunction. Her ironic notation “First actual case of a bug being found,” acknowledges that she didn’t coin the term “bug,” which was already engineering jargon for a machine error. More importantly, she recognized, as Aiken did, that the future of computing would be as much in business data processing as in scientific calculation.
1949 年,霍珀加入了埃克特-莫奇利计算机公司,该公司正在将摩尔学院的设计商业化(见第 90 页)。该公司被雷明顿·兰德(Remington Rand)收购,后者即本文所列的霍珀所属单位;该公司的机器被称为 UNIVAC。(经过进一步并购,公司更名为 UNISYS。)
In 1949 Hopper joined the Eckert–Mauchly Computer Corporation, which was commercializing the design that had come out of the Moore School (see page 90). That company was bought by Remington Rand, Hopper’s affiliation as listed in this paper; the company’s machine was called the UNIVAC. (After further mergers and acquisitions the company became UNISYS.)
在 20 世纪 40 年代,没有更高级的语言,没有解析器,甚至几乎没有可用的机器代码符号表示。没有编译器,也没有调试工具。将算法翻译成运行代码的整个过程都是手工完成的,使用纸和铅笔,直到输入程序的最后一步。霍珀在这里标志着计算机辅助程序员进程的开始。这篇论文不仅提到了“编译器”(包括今天所谓的链接加载器和代码重定位),还提到了符号编程(她预计计算机将计算符号导数);代码优化权衡;全局程序分析(“扫描一次计算机信息以检查其结构”);宏汇编语言(此处称为“多地址代码”);子程序的正式规范;分层程序结构;将注意力从战时非常重要的数值算法转向商业应用;并认识到从长远来看软件的成本将大大超过硬件的成本。
In the 1940s there were no higher level languages, no parsers, and hardly even a usable symbolic representation for machine code. There were no compilers and no debugging tools. The entire process of translating an algorithm into running code was done by hand, using paper and pencil until the very last step of inputting the program. Hopper here signals the beginning of the process of computers assisting programmers. The paper includes allusions not only to “compilers” (including what today would be called linking loaders and code relocation) but symbolic programming (she anticipates that computers will compute symbolic derivatives); code optimization trade-offs; global program analysis (“sweeping the computer information once to examine its structure”); macro-assembly language (referred to here as a “multiple-address code”); formal specifications of subroutines; hierarchical program structure; the redirection of attention from the kinds of numerical algorithms that had been so important during wartime toward commercial applications; and the recognition that the cost of software would in the long run vastly exceed the cost of hardware.
所有这些要素都在本文中,但没有直接描述。这篇论文被精心拟人化。在扩展和泛化的每个阶段,很难判断人与机器之间的界限在哪里,因为霍珀预计这条界限会随着时间的推移而发生变化——计算机的“教育”将会进步。几年之内,她就成为雷明顿兰德公司“自动编程”的负责人,领导了早期编程语言 ARITH-MATIC 和 MATH-MATIC(Ash 等人,1957 年)的开发,这两种语言似乎大致对应于本文的A类例程和B类例程。
All these elements are in this paper, but they are not described straightforwardly. The paper is elaborately anthropomorphic. It is hard to tell at each stage of expansion and generalization where the line between human and machine is supposed to lie, because Hopper is anticipating that the line would shift over time—the computer’s “education” would advance. Within a couple of years she had become the head of “automatic programming” at Remington Rand, leading the development of the early programming languages ARITH-MATIC and MATH-MATIC (Ash et al., 1957), which seem to correspond roughly to the Type A and Type B routines of this paper.
这些语言是 UNIVAC 特有的。但霍珀后来成为 COBOL(一种面向商业的通用语言)开发和标准化背后的坚定力量。当时人们怀疑计算机能否处理用类似英语的词汇表述的数据操作,也怀疑使用此类语言带来的编程效率提升能否远远超过执行速度的损失。她一生都被誉为顽固的逆向思想家,用生动的比喻来表达自己的观点。她在公开演讲中分发一英尺长的铜线作为“纳秒”,并在办公桌后面展示了一个带有镜像表盘的倒转时钟,以表明惯例是用来打破的。
Those languages were specific to the UNIVAC. But Hopper went on to be the determined force behind the development and standardization of COBOL (a COmmon Business Oriented Language), in the face of skepticism that computers could be made to handle data manipulations paraphrased in English-like vocabulary, and that the gains in programming efficiency from using such languages would far outweigh any loss in execution speed. She had a reputation as a stubborn, contrarian thinker throughout her life, using vivid metaphors to make her points. She brought “nanoseconds” to pass out in public talks in the form of foot-long segments of copper wire, and displayed a backwards-running clock with a mirror-image dial behind her desk to make the point that conventions are made to be broken.
霍珀在海军中有着杰出的职业生涯,获得了海军少将的军衔。我对她有一份个人记忆,这段记忆表明了她职业生涯每一步都遇到的障碍。20 世纪 70 年代,她身着制服回到哈佛,此时距她作为新入伍的预备役军人在那里从事 Mark I 工作已过去几十年,而她当时心情很糟。在她飞往波士顿的航班上,机组人员对这位身穿海军将官制服、身材娇小、头发灰白的女士毕恭毕敬——却把她当成了一位退休空姐!
Hopper had a distinguished career in the Navy, attaining the rank of Rear Admiral. I have a personal memory of her, one that suggests the obstacles that stood in the way of her career at every step. When she returned to Harvard in uniform in the 1970s, decades after her Mark I work there as a novice reservist, she was in a foul mood. The cabin crew of the airplane on which she had flown to Boston had treated the diminutive, gray-haired lady in the Navy admiral’s uniform with great deference—as a retired stewardess!
虽然物化是新事物,但机械化数学思维的想法并不新鲜。它的谱系始于算盘,一直延伸到帕斯卡、莱布尼茨和巴贝奇。更直接地说,这里提出的想法来自哈佛大学的 Howard H. Aiken、Eckert-Mauchly 的 John W. Mauchly 和剑桥大学的 MV Wilkes。1946 年,Aiken 提出了 Mark I 手册中描述的例程库的想法,以及 Mark III 编码机中体现的概念,Mauchly 提出了“短序编码”的基本原理以及建议、批评,以及不懈地耐心倾听这些当前的尝试;威尔克斯(Wilkes)提供了一本关于这个主题的书,这是所有书中最大的帮助。对于本文中包含的他们的想法,我最诚挚地表示感谢和赞赏。
While the materialization is new, the idea of mechanizing mathematical thinking is not new. Its lineage starts with the abacus and descends through Pascal, Leibniz, and Babbage. More immediately, the ideas here presented originate from Howard H. Aiken of Harvard University, John W. Mauchly of Eckert–Mauchly and M. V. Wilkes of the University of Cambridge. From Aiken came, in 1946, the idea of a library of routines described in the Mark I manual, and the concepts embodied in the Mark III coding machine; from Mauchly, the basic principles of the “short-order code” and suggestions, criticisms, and untiring patience in listening to these present attempts; from Wilkes, the greatest help of all, a book on the subject. For those of their ideas which are included herein, I most earnestly express my debt and my appreciation.
首先,图 16.1表示操作所需元素的配置:操作的输入;控制,即使它们只是启动和停止;预先准备好提供给操作的工具;产品的产量,反过来又可能成为另一个操作的输入。这是生产线的基本要素;原材料的输入,由人类控制,可能通过仪器;随机床一起提供;该工厂生产一辆汽车、一条铁路或一罐西红柿。
To start at the beginning, Figure 16.1 represents the configuration of the elements required by an operation: input to the operations; controls, even if they be only start and stop; previously prepared tools supplied to the operation; and output of products, which may, in turn, become the input of another operation. This is the basic element of a production line; input of raw materials, controlled by human beings, possibly through instruments; supplied with machine tools; the operation produces an automobile, a rail, or a can of tomatoes.
图 16.1: 操作
Figure 16.1: An operation
武装部队、政府和工业界不仅有兴趣创建新的行动以产生新的成果,而且有兴趣提高旧行动的效率。图 16.2是一个非常古老的运算,是数学问题的解决方案。它适合操作配置:数学数据的输入;由数学家控制;提供记忆、公式、表格、铅笔和纸;大脑进行算术并产生结果。
The armed services, government, and industry are interested not only in creating new operations to produce new results, but also increasing the efficiency of old operations. A very old operation, Figure 16.2, is the solution of a mathematical problem. It fits the operational configuration: input of mathematical data; control by the mathematician; supplied with memory, formulas, tables, pencil, and paper; the brain carries on the arithmetic, and produces results.
图 16.2: 问题的解决方案
Figure 16.2: Solution of problem
目前的目标是尽可能用电子数字计算机取代人脑。这种计算机本身适合这种配置,如图 16.3 所示。(如果您允许,我将使用 UNIVAC 作为电子数字计算机的代名词;主要是因为我这样认为,但也因为它很方便。)
It is the current aim to replace, as far as possible, the human brain by an electronic digital computer. That such computers themselves fit this configuration may be seen in Figure 16.3. (With your permission, I shall use UNIVAC as synonymous with electronic digital computer; primarily because I think that way, but also because it is convenient.)
图 16.3: UNIVAC 系统。[编辑:UNITYPER 是打字机输入设备,UNISERVO 是磁带驱动器。]
Figure 16.3: The UNIVAC system. [EDITOR: UNITYPER was a typewriter input device, UNISERVO a magnetic tape drive.]
将人类和电子计算机的配置加在一起,图 16.4 显示了两个操作级别的问题的解决方案。数学家不再需要承担算术琐事,他已成为一名程序员,并将这项职责分配给了 UNIVAC。程序员被提供了一个“代码”,他可以用它把指令翻译给计算机。UNIVAC 的工程师设计的“标准知识”由基本算术和逻辑组成。
Adding together the configurations of the human being and the electronic computer, Figure 16.4 shows the solution of a problem in two levels of operation. The arithmetical chore has been removed from the mathematician, who has become a programmer, and this duty assigned to the UNIVAC. The programmer has been supplied with a “code” into which he translates his instructions to the computer. The “standard knowledge” designed into the UNIVAC by its engineers, consists of its elementary arithmetic and logic.
图 16.4: 问题的解决方案。
Figure 16.4: Solution of a problem.
这种情况一直保持不变,直到发明程序的新鲜感逐渐消失并退化为编写和检查程序的枯燥劳动。现在,这项责任已成为人类大脑的强加义务。此外,随着计算机的付费,编程成本和消耗的时间也会引起副总裁和项目总监的注意。常识表明需要插入第三级操作,如图 16.5 所示。
This situation remains static until the novelty of inventing programs wears off and degenerates into the dull labor of writing and checking programs. This duty now looms as an imposition on the human brain. Also, with the computer paid for, the cost of programming and the time consumed, comes to the notice of vice-presidents and project directors. Common sense dictates the insertion of a third level of operation, Figure 16.5.
图 16.5: 编译例程和子例程。
Figure 16.5: Compiling routines and subroutines.
程序员可能会回归数学家的身份。他获得了一个子程序目录。他不再需要可用的公式或初等函数表。他甚至不需要知道计算机使用的特定指令代码。他只需要能够使用目录向计算机提供有关他的问题的信息。UNIVAC 根据数学家提供的信息,在“A 型编译例程”的控制下,使用子例程和自己的指令代码生成程序。该程序反过来指导 UNIVAC 对输入数据进行计算,并生成所需的结果。所消耗的时间和错误来源已大大减少。如果例程库内容充足,编程时间就会缩短到几个小时,而不是几周。该程序不再受到转录错误或未经测试的例程的影响。
The programmer may return to being a mathematician. He is supplied with a catalogue of subroutines. No longer does he need to have available formulas or tables of elementary functions. He does not even need to know the particular instruction code used by the computer. He needs only to be able to use the catalogue to supply information to the computer about his problem. The UNIVAC, on the basis of the information supplied by the mathematician, under the control of a “compiling routine of type A,” using subroutines and its own instruction code, produces a program. This program, in turn directs the UNIVAC through the computation on the input data and the desired results are produced. A major reduction in time consumed and in sources of error has been made. If the library is well-stocked, programming has been reduced to a matter of hours, rather than weeks. The program is no longer subject either to errors of transcription or of untested routines.
计算机信息、目录、编译例程和子例程的规范将在框图中添加另一层后给出。如图 16.5 所示,数学家仍然必须执行所有数学运算,而将编程和计算操作交给 UNIVAC。然而,数学家提供的计算机信息不再处理数值本身。它以符号形式处理变量和常量以及对它们的操作。现在可以插入第四级操作,如图 16.6 所示。例如,假设数学家希望计算一个函数及其前 n 个导数。他将定义函数本身的信息发送给 UNIVAC。在“B 型编译例程”(在本例中为微分器)的控制下,使用任务例程,UNIVAC 提供对函数及其导数的计算进行编程所需的信息。UNIVAC 根据函数的公式推导出连续导数的公式。该信息在 A 类编译例程下处理后产生一个程序来指导计算。
Specifications for computer information, a catalogue, compiling routines, and subroutines will be given after adding another level to the block diagram. As Figure 16.5 stands the mathematician must still perform all mathematical operations, relegating to the UNIVAC programming and computational operations. However, the computer information delivered by the mathematician no longer deals with numerical quantities as such. It treats of variables and constants in symbolic form together with operations upon them. The insertion of a fourth level of operation is now possible, Figure 16.6. Suppose, for example, the mathematician wishes to evaluate a function and its first n derivatives. He sends the information defining the function itself to the UNIVAC. Under control of a “compiling routine of type B,” in this case a differentiator, using task routines, the UNIVAC delivers the information necessary to program the computation of the function and its derivatives. From the formula for the function, the UNIVAC derives the formulas of the successive derivatives. This information processed under a compiling routine of Type A yields a program to direct the computation.
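The “differentiator” role that Hopper assigns to a type B compiling routine can be illustrated with a small symbolic differentiator. This is only a sketch under stated assumptions: the nested-tuple expression format, the operator names, and the function names are inventions for this example and are not taken from the paper. Given the formula for a function, it derives the formulas of the successive derivatives, which a later stage could then turn into a program.

```python
import math


def differentiate(expr, var):
    """Return the (unsimplified) derivative of expr with respect to var."""
    if isinstance(expr, (int, float)):
        return 0
    if isinstance(expr, str):                 # a named variable or constant
        return 1 if expr == var else 0
    op = expr[0]
    if op == "+":
        return ("+", differentiate(expr[1], var), differentiate(expr[2], var))
    if op == "*":                             # product rule
        return ("+",
                ("*", differentiate(expr[1], var), expr[2]),
                ("*", expr[1], differentiate(expr[2], var)))
    if op == "neg":
        return ("neg", differentiate(expr[1], var))
    if op == "exp":                           # chain rule
        return ("*", expr, differentiate(expr[1], var))
    if op == "sin":
        return ("*", ("cos", expr[1]), differentiate(expr[1], var))
    if op == "cos":
        return ("*", ("neg", ("sin", expr[1])), differentiate(expr[1], var))
    raise ValueError(f"unknown operation: {op!r}")


def derivatives(expr, var, n):
    """Return [f, f', ..., f^(n)] as symbolic expressions."""
    result = [expr]
    for _ in range(n):
        result.append(differentiate(result[-1], var))
    return result


def evaluate(expr, env):
    """Numerically evaluate a symbolic expression given variable values."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    vals = [evaluate(a, env) for a in args]
    if op == "+":
        return vals[0] + vals[1]
    if op == "*":
        return vals[0] * vals[1]
    if op == "neg":
        return -vals[0]
    if op == "exp":
        return math.exp(vals[0])
    if op == "sin":
        return math.sin(vals[0])
    if op == "cos":
        return math.cos(vals[0])
    raise ValueError(f"unknown operation: {op!r}")


# The paper's later example function, exp(-x^2) * sin(c*x), in the assumed format.
F = ("*", ("exp", ("neg", ("*", "x", "x"))), ("sin", ("*", "c", "x")))
```

In this picture `derivatives` corresponds to the symbolic stage and `evaluate` stands in for the numerical program that the later stage would produce.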
图 16.6: 编译 B 类和任务例程。
Figure 16.6: Compiling Type B and task routines.
扩展使得这个过程看起来又长又复杂。它不是。再次简化为两部分系统:数学家和计算机,图 16.7展示了计算系统的更准确的图景。
Expansion makes this procedure look, and seem, long and complicated. It is not. Reducing again to the two-component system, the mathematician and the computer, Figure 16.7 presents a more accurate picture of the computing system.
图 16.7: 计算系统。
Figure 16.7: Computing system.
假设代码、程序、输入数据和结果是熟悉的术语,则仍然需要定义和指定该系统可接受的信息和例程的形式。其中包括目录;计算机信息;子程序;A 型和 B 型编译例程;以及任务例程。
Presuming that code, program, input data, and results are familiar terms, it remains to define and specify the forms of information and routines acceptable to this system. These include catalogue; computer information; subroutine; compiling routines, type A and B; and task routines.
一旦明确了使用子例程的目的,就会出现两种方法。在其中一种情况下,程序引用立即可用的子例程,使用它并继续计算。对于子程序数量有限的情况,该方法是可行且有用的。这样的系统是由计算分析实验室[编辑:埃克特-莫赫利计算机公司]的工作人员以“短序代码”的昵称开发的。
As soon as the purpose is stated to make use of subroutines, two methods arise. In one, the program refers to an immediately available subroutine, uses it, and continues computation. For a limited number of subroutines, this method is feasible and useful. Such a system has been developed under the nickname of the “short-order code” by members of the staff of the Computational Analysis Laboratory [EDITOR: at Eckert–Mauchly Computer Corporation].
第二种方法不仅查找子例程,而且将其适当调整后翻译成程序。因此,完成的程序可以在任何需要的时候作为一个单元运行,并且本身可以作为更高级的子例程放置在库中。
The second method not only looks up the subroutine, but translates it, properly adjusted, into a program. Thus, the completed program may be run as a unit whenever desired, and may itself be placed in the library as a more advanced subroutine.
每个问题都必须简化为可用子例程的级别。假设一个简单的问题,使用基本子例程计算 $y = e^{-x^2} \sin cx$。公式的每一步都属于操作模式,图 16.8;即,$u = x^2$;$U = e^{-u}$;$v = cx$;$V = \sin v$;$y = UV$。然而,如图 16.9 所示,该信息尚未充分标准化,无法被编译例程接受。必须考虑几个问题并定义程序。[编辑:霍珀在这里追随洛夫莱斯的脚步——与第 14 页的图 3.1 进行比较。]
Each problem must be reduced to the level of the available subroutines. Suppose a simple problem, to compute $y = e^{-x^2} \sin cx$, using elementary subroutines. Each step of the formula falls into the operational pattern, Figure 16.8; that is, $u = x^2$; $U = e^{-u}$; $v = cx$; $V = \sin v$; $y = UV$. As presented in Figure 16.9, however, this information is not yet sufficiently standardized to be acceptable to a compiling routine. Several problems must be considered and procedures defined. [EDITOR: Hopper here follows the footsteps of Lovelace—compare to Figure 3.1 on page 14.]
图 16.8: 操作
Figure 16.8: Operation
图 16.9: 示例 [编辑器:aaL=“添加到限制”(Ash 等人,1957 年,第 86 页)。]
Figure 16.9: Example [EDITOR: aaL = “add to a limit” (Ash et al., 1957, page 86).]
这些操作按正常顺序编号,并且该编号成为计算机信息的一部分。因此,当需要改变正常顺序时,很容易识别替代目的地。编译例程将这些操作数翻译成编码程序中的指令。出现两种基本情况,备用目的地要么在所考虑的操作之前,要么在其之后,绕过多个中间操作。在这两种情况下,只需要让编译例程记住它把每个子例程放在哪里或者已经指示了将控制权转移到操作k 。无论如何,数学家只需要声明“进行操作k ”,编译例程就会完成其余的工作。
The operations are numbered in normal sequence and this number becomes part of the computer information. Thus when it is desired to change the normal sequence, the alternate destination is readily identified. The compiling routine translates these operation numbers into instructions in the coded program. Two fundamental situations arise, the alternate destination either precedes the operation under consideration or follows it, by-passing several intermediate operations. In both cases, it is necessary only to have the compiling routine remember where it has placed each subroutine or that a transfer of control to operation k has been indicated. In any event the mathematician need only state, “go to operation k,” and the compiling routine does the rest.
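The two situations described above, a destination that precedes the current operation and one that follows it, can be sketched as a single pass that remembers where each subroutine was placed and keeps a list of transfers still waiting for their target. This is an illustrative sketch only: the input format, the assumption that the transfer instruction is the last line of a subroutine, and all names are hypothetical rather than taken from the paper.

```python
def compile_operations(operations):
    """Place operations in sequence and resolve every "go to operation k".

    operations: list of (length_in_lines, goto_operation_or_None),
                numbered 1, 2, 3, ... in normal sequence.
    Returns (start_of, jumps): the start address of each operation, and a
    mapping from the address of each transfer line to its target address.
    """
    start_of = {}   # operation number -> address where its subroutine begins
    pending = {}    # operation number -> transfer lines waiting for it
    jumps = {}      # address of transfer line -> resolved target address
    address = 0
    for number, (length, goto) in enumerate(operations, start=1):
        start_of[number] = address
        # Forward references recorded earlier can now be filled in.
        for jump_line in pending.pop(number, []):
            jumps[jump_line] = address
        if goto is not None:
            jump_line = address + length - 1   # assumed: transfer is the last line
            if goto in start_of:               # destination already placed
                jumps[jump_line] = start_of[goto]
            else:                              # destination follows: remember it
                pending.setdefault(goto, []).append(jump_line)
        address += length
    if pending:
        raise ValueError(f"transfer to undefined operation(s): {sorted(pending)}")
    return start_of, jumps
```

For example, with `[(4, None), (3, 4), (5, None), (2, 2)]`, operation 2 transfers forward to operation 4 and operation 4 transfers back to operation 2; the caller only names the operation numbers and never sees an address.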
接下来要关心的是用于参数和结果以及运算的符号。一位数学家可能会写成 $y = e^{-x^2} \sin cx$,另一位数学家可能写成 $u = e^{-v^2} \sin gv$。事实证明,显而易见的解决方案是最好的。列出参数和结果并编号。(这相当于将所有常量和变量写为 $x_i$。)顺序并不重要,因此可以在末尾添加忘记的量(图 16.10)。
The symbols to be used for the arguments and results, as well as for the operations, are of next concern. One mathematician might write $y = e^{-x^2} \sin cx$, and another $u = e^{-v^2} \sin gv$. The obvious solution proves best. Make a list of arguments and results and number them. (This amounts to writing all constants and variables as $x_i$.) The order is immaterial, so that forgotten quantities can be added at the end (Figure 16.10).
Figure 16.10: Variable table for Figure 16.9
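The numbering idea can be made concrete with a small sketch. The numbering below (1 = x, 2 = c, 3 = u, 4 = U, 5 = v, 6 = V, 7 = y), the subroutine names in `CATALOGUE`, and the tuple format of `PROGRAM` are assumptions made for illustration; they are not the contents of Figures 16.9 and 16.10. Each operation names a subroutine, the numbers of its arguments, and the number of its result, so the mathematician's own letters never reach the machine.

```python
import math

# Hypothetical catalogue of elementary subroutines.
CATALOGUE = {
    "square": lambda a: a * a,
    "exp_neg": lambda a: math.exp(-a),
    "multiply": lambda a, b: a * b,
    "sine": math.sin,
}

# Operations in normal sequence: (subroutine, argument numbers, result number).
PROGRAM = [
    ("square",   (1,),   3),   # u = x^2
    ("exp_neg",  (3,),   4),   # U = e^(-u)
    ("multiply", (2, 1), 5),   # v = c x
    ("sine",     (5,),   6),   # V = sin v
    ("multiply", (4, 6), 7),   # y = U V
]


def run(program, inputs):
    """Execute the numbered operations over a numbered table of quantities."""
    table = dict(inputs)            # quantity number -> current value
    for name, arg_numbers, result_number in program:
        args = [table[i] for i in arg_numbers]
        table[result_number] = CATALOGUE[name](*args)
    return table
```

Calling `run(PROGRAM, {1: 0.5, 2: 2.0})` fills in quantities 3 through 7; quantity 7 is the value of y for x = 0.5 and c = 2.0. Because quantities are referred to only by number, a forgotten one can be appended to the table without renumbering the rest.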
使用“调用号”系统作为操作和子程序的符号。这些字母字符代表子例程的类别。按照威尔克斯博士的例子,这些符号部分是表音的;即,a = 算术,t = 三角函数,x = 指数;amc = 算术,乘以一个常数;x-e = $e^{-U}$;ts0 = 三角函数,正弦。与调用号放在一起的 n、f 或 s 表示正常、浮动或指定(固定)小数点。其他字母和数字表示角度的弧度或度数、复数等。这些调用号与参数、控制和结果的说明顺序一起列在目录中。……
As symbols for the operations and subroutines, a system of “call-numbers” is used. These alphabetic characters represent the class of subroutines. Following Dr. Wilkes’ example, these symbols are partially phonetic; that is, a = arithmetic, t = trigonometric, and x = exponential; amc = arithmetic, multiplication by a constant; x-e = $e^{-U}$; ts0 = trigonometric, sine. Placed with the call-numbers, n, f, or s, indicates normal, floating, or stated (fixed) decimal point. Other letters and digits indicate radians or degrees for angles, complex numbers, etc. These call-numbers are listed in the catalogue together with the order in which arguments, controls, and results are to be stated. …
库中的每个子例程都以相对于其入口行(被视为 001)的编码来表示。一般来说,它们的编程和编码是为了实现最大精度和最短计算时间。它们可以在自身内部存储自己特有的常数。它们还可以利用随每个程序读入的某些“永久常量”。这些永久常量占据内存的保留部分,并由字母内存位置调用,这是目前 UNIVAC 特有的一种技巧。因此,在程序中定位子例程的过程中,这些地址不会被修改。它们包括 $1/2\pi$、$\pi/4$、$\log_{10} e$、$\pm 0$、.2、.5 等量。
Each subroutine in the library is expressed in coding relative to its entrance line considered as 001. They are, in general, programmed and coded for maximum accuracy and minimum computing time. They may store within themselves constants peculiar to themselves. They may also make use of certain “permanent constants” read in with every program. These permanent constants occupy a reserved section of the memory and are called for by alphabetic memory locations, a trick, at present peculiar to UNIVAC. Thus, these addresses are not modified in the course of positioning the subroutine in a program. They include such quantities as $1/2\pi$, $\pi/4$, $\log_{10} e$, $\pm 0$, .2, .5, and the like.
每个子程序前面都有某些信息,匹配和补充数学家提供的信息:
Each subroutine is preceded by certain information, matching and supplementing that supplied by the mathematician:
1. 调用号;
1. call-number;
2. 参数,子程序中参数的目的地,以子程序的相对编码表示;
2. arguments, the destination of the arguments within the subroutine, expressed in the relative coding of the subroutines;
3. 不可修改的指示符定位嵌入在子程序中的不可更改的常量;
3. non-modification indicators locating constants embedded in the subroutine which are not to be altered;
4. 结果,结果在子程序中的位置,以相对编码表示。
4. results, the positions of the results within the subroutine, expressed in relative coding.
每个子程序都按标准模式排列。
Each subroutine is arranged in a standard pattern.
入口行——子程序的第一行是它的入口行,因此在相对编码中它是第一行。它是转移到程序的子例程的第一行,并且包含将控制转移到第一操作行的指令。
Entrance line – The first line of a subroutine is its entrance line, thus in relative coding it is number one. It is the first line of the subroutine transferred to a program, and it contains an instruction transferring control to the first action line.
退出行– 子例程的第二行是其正常退出行。它包含一条将控制权转移到子例程最后一行之后的行的指令。除非需要交替的控制转移,否则子程序的所有出口都参考正常出口线。备用退出线,涉及从通常顺序转移控制权,按照目录中列出的预定顺序遵循正常退出线。
Exit lines – The second line of a subroutine is its normal exit line. This contains an instruction transferring control to the line following the last line of the subroutine. Unless an alternate transfer of control is desired, all exits from the subroutine are referred to the normal exit line. Alternate exit lines, involving transfers of control from the usual sequence follow the normal exit line in a predetermined order as listed in the catalogue.
参数– 退出行后面是为按预定顺序排列的参数保留的空间。
Arguments – The exit lines are followed by spaces reserved for the arguments arranged in predetermined order.
结果– 结果也按指定顺序位于参数后面。
Results – The results, also in specified order, follow the arguments.
常量——如果可能的话,结果后面跟随子程序特有的任意常量。当子程序由其他子程序组合而成时,常量组也可能嵌入在子程序中。这些由非修改信息来处理。
Constants – The results are followed, when possible, by any arbitrary constants peculiar to the subroutine. When the subroutine has been compounded from other subroutines groups of constants may also appear embedded in the subroutine. These are cared for by the non-modification information.
第一个操作行出现在子例程中的下一个位置。它在相关编码中的位置由入口线定义。该行之前不能有任何指令行。
The first action line appears next in the subroutine. Its position in the relative coding is defined by the entrance line. No instruction line may precede this line.
分配给入口线和出口线、参数、结果和常量的顺序是任意的。这很方便。所需要的只是建立一个序列并且计算机识别该序列。
The sequence assigned to the entrance and exit lines, arguments, results, and constants is arbitrary. It is convenient. All that is required is that a sequence be established and that the computer recognize this sequence.
为了方便操作,将一定数量的基本子程序组合成一个子库。...随着子例程的添加来扩展库,它变得更加有用,并且编程时间进一步减少。
For convenience in manipulation, a certain number of elementary subroutines have been combined to form a sub-library. …As subroutines are added to extend the library, it becomes more useful and programming time is further reduced.
事实上,有一天,基本的子程序很少被使用,计算机信息将只包含七八个项目来调用强大的子程序。
Indeed, the day may come when the elementary subroutines are rarely used and the computer information will contain but seven or eight items calling into play powerful subroutines.
没有经验的程序员没有必要也不建议篡改子程序中的编码。它通常是使用经验丰富的程序员已知的每一种技巧和设备进行最小延迟编码。已在电脑上运行测试。……
It is not necessary, nor is it advisable that the inexperienced programmer tamper with the coding within a subroutine. It is usually minimum latency coding using every trick and device known to the experienced programmer. It has been tested by operation on the computer. …
A 型编译例程旨在根据数学家或计算机提供的信息选择和排列子例程。基本上,只有一种 A 型例程。然而,由于 UNIVAC 代码包含同时传输两个相邻量的指令,因此设计了第二个编译例程来处理浮点小数、复数和双精度程序。对于数学家列出的每个运算,A 型例程将执行以下服务:
Compiling Routines Type A are designed to select and arrange subroutines according to information supplied by the mathematician or by the computer. Basically, there is but one Type A routine. However since the UNIVAC code contains instructions transferring two neighboring quantities simultaneously, a second compiling routine has been designed to care for floating decimal, complex number, and double precision programs. For each operation listed by the mathematician a type A routine will perform the following services:
1. 找到调用号所指示的子程序;
1. locate the subroutine indicated by the call-number;
2.从计算机和子程序信息结合其程序记录来制作指令并将其输入程序中,将参数从工作存储器传送到子程序;
2. from the computer and subroutine information combined with its record of the program fabricate and enter in the program the instructions transferring the arguments from working storage to the subroutines;
3、将入口线和正常出口线调整到程序中子程序的位置,并输入到程序中;
3. adjust the entrance and normal exit lines to the position of the subroutine in the program and enter them in the program;
4、根据编程器提供的控制信息调整备用退出线并输入到程序中(此过程涉及到记录的引用);
4. according to the control information supplied by the programmer adjust alternate exit lines and enter them in the program (this process involves reference to the record);
5、根据前面操作提供的控制信息调整辅助入口线并输入程序中;
5. according to the control information supplied with previous operations adjust auxiliary entrance lines and enter them in the program;
6、修改子程序指令中的所有地址,并将这些指令输入到程序中;
6. modify all addresses in the subroutine instructions and enter these instructions in the program;
7. 根据子程序提供的信息,保留子程序中嵌入的所有常量不变,并将它们传送到程序中;
7. according to information supplied by the subroutine, leave unaltered all constants embedded in the subroutine and transfer them to the program;
8. 从计算机和子程序信息中编制并将结果输入程序中的指令,将结果传输到[编辑器:文本丢失,可能是“工作存储”]。
8. from the computer and the subroutine information fabricate and enter in the program the instructions transferring the results to [EDITOR: text missing, perhaps “working storage”].
9. 维护并生成程序记录,包括每个子程序的调用号及其入口行在程序中的位置。
9. maintain and produce a record of the program including the call-number of each subroutine and the position of its entrance line in the program.
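The relocation at the heart of these services can be sketched in a modern language. The following is only an illustration, not UNIVAC coding: the `Subroutine` class, the function name `compile_type_a`, and the operation names used in the usage note are hypothetical, and the sketch covers only services 1, 6, 7, and 9 (locating by call-number, modifying addresses, leaving embedded constants unaltered, and keeping a record).

```python
from dataclasses import dataclass, field


@dataclass
class Subroutine:
    """A library subroutine in relative coding (hypothetical model)."""
    call_number: str
    # Each line is (operation, relative_address); relative line 1 is the entrance line.
    lines: list
    # 1-based relative line numbers of embedded constants that must not be altered.
    non_modifiable: set = field(default_factory=set)


def compile_type_a(operations, library):
    """Place each requested subroutine in the program and relocate its addresses."""
    program = []  # finished program: list of (operation, address)
    record = []   # (call_number, position of its entrance line in the program)
    for call_number in operations:
        sub = library[call_number]              # service 1: locate by call-number
        base = len(program)                     # entrance line will sit at base + 1
        record.append((call_number, base + 1))  # service 9: maintain the record
        for rel_line, (op, addr) in enumerate(sub.lines, start=1):
            if rel_line in sub.non_modifiable:
                program.append((op, addr))         # service 7: constant copied unaltered
            else:
                program.append((op, addr + base))  # service 6: address modified
    return program, record
```

For instance, with a library holding a three-line `ts0` whose second line is a constant and a two-line `amc`, compiling the sequence `ts0, amc, ts0` places the entrance lines at positions 1, 4, and 6, and every relocated address is shifted by the number of lines already in the program.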
编译例程还包含有关UNIVAC特有的输入磁带、磁带库和程序磁带的某些指令。所有的计数操作,如临时存储器和程序空间的分配、输入和输出的控制等都由编译程序稳定地进行。坦率地说,编译例程是程序员,它执行生成最终程序所需的所有服务。
The compiling routines also contain certain instructions concerning input tapes, tape library, and program tapes peculiar to the UNIVAC. All counting operations such as allocation of temporary storage and program space, and control of input and output are carried on steadily by the compiling routine. Stated bluntly the compiling routine is the programmer and performs all those services necessary to the production of a finished program.
B型编译例程将对每一个操作,通过“任务例程”,用新的信息替换或补充给定的计算机信息。因此,编译例程B-1将为每个操作复制有关该操作的信息并调用相应的任务例程。任务例程将生成公式,并导出计算运算导数所需的信息。然后,编译例程 B-1 以适合提交给 A 类例程的形式记录该信息。
Compiling Routines of Type B will for each operation, by means of “task routines,” replace or supplement the given computer information with new information. Thus, compiling routine B-1 will, for each operation, copy the information concerning that operation and call in the corresponding task routine. The task routine will generate the formula, and derive the information necessary to compute the derivative of the operation. Compiling routine B-1 then records this information in a form suitable for submission to a Type A routine.
由于信息可能会重新提交给 B 类例程,因此显然,为了获得计算f ( x ) 及其前n 个导数的程序,只需给出定义f ( x )的信息和n的值。f ( x )导数的公式将通过重复应用 B-1 导出并通过 A 类例程进行编程。
Since information may be re-submitted to a type B routine, it is obvious that in order to obtain a program to compute f(x) and its first n derivatives only the information defining f(x) and the value of n need be given. The formulas for the derivatives of f(x) will be derived by repeated applications of B-1 and programmed by a type A routine.
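The idea of re-submitting information to a differentiating routine can be illustrated with a small symbolic sketch. This is an analogy only, not a reconstruction of B-1: the tuple representation of formulas and the names `differentiate`, `evaluate`, and `derivatives` are assumptions made for the example, and each branch of `differentiate` loosely plays the part of a task routine for one operation.

```python
import math


def differentiate(e):
    """Return the formula for the derivative of expression e with respect to x."""
    kind = e[0]
    if kind == "const":
        return ("const", 0.0)
    if kind == "x":
        return ("const", 1.0)
    if kind == "add":
        return ("add", differentiate(e[1]), differentiate(e[2]))
    if kind == "mul":  # product rule
        return ("add", ("mul", differentiate(e[1]), e[2]),
                       ("mul", e[1], differentiate(e[2])))
    if kind == "sin":  # chain rule
        return ("mul", ("cos", e[1]), differentiate(e[1]))
    if kind == "cos":
        return ("mul", ("mul", ("const", -1.0), ("sin", e[1])), differentiate(e[1]))
    if kind == "exp":
        return ("mul", e, differentiate(e[1]))
    raise ValueError(f"unknown operation: {kind}")


def evaluate(e, x):
    """Compute the numerical value of expression e at the point x."""
    kind = e[0]
    if kind == "const":
        return e[1]
    if kind == "x":
        return x
    if kind == "add":
        return evaluate(e[1], x) + evaluate(e[2], x)
    if kind == "mul":
        return evaluate(e[1], x) * evaluate(e[2], x)
    if kind == "sin":
        return math.sin(evaluate(e[1], x))
    if kind == "cos":
        return math.cos(evaluate(e[1], x))
    if kind == "exp":
        return math.exp(evaluate(e[1], x))
    raise ValueError(f"unknown operation: {kind}")


def derivatives(f, n):
    """Return [f, f', ..., f^(n)] by applying differentiate to its own output."""
    formulas = [f]
    for _ in range(n):
        formulas.append(differentiate(formulas[-1]))
    return formulas
```

Only the definition of f(x) and the value of n are supplied; for f(x) = x sin x and n = 2, `derivatives` yields formulas that evaluate to sin x + x cos x and 2 cos x − x sin x. The sketch does not simplify the formulas it generates.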
在这里,可以最好地回答关于喜欢或厌恶子例程的问题。由于以这种方式使用子例程增强了计算机的能力,问题变得毫无意义,并转变为如何更快地生成更好的子例程的问题。然而,权衡使用子程序的优点和缺点,其中的优点是
It is here that the question can best be answered concerning a liking for or an aversion to subroutines. Since the use of subroutines in this fashion increases the abilities of the computer, the question becomes meaningless and transforms into a question of how to produce better subroutines faster. However, balancing the advantages and disadvantages of using subroutines, among the advantages are
1. 将内存分配、地址修改和转录等机械工作移交给 UNIVAC;
1. relegation of mechanical jobs such as memory allocation, address modification, and transcription to the UNIVAC;
2. 排除编程错误、转录错误等错误源;
2. removal of error sources such as programming errors and transcription errors;
3.节省编程时间;
3. conservation of programming time;
4、对运算本身进行运算的能力;
4. ability to operate on operations;
5.避免重复工作,因为每个程序又可以成为子程序。
5. duplication of effort is avoided, since each program in turn may become a subroutine.
只有两个缺点是显而易见的。由于标准化,在执行重复数据传输时会损失少量时间,而这可以在定制例程中消除。在基本负载问题中,这可能会变得严重。然而,即使在这种情况下,让 UNIVAC 生成原始程序并在重新运行问题之前消除此类重复也是值得的。第二个缺点应该不会长期存在。事实上,如果所需的子例程不存在,则必须对其进行编程并将其添加到库中。在积累大量品种之前,这种情况最有可能发生在输入和输出编辑例程的情况下。这种情况也强调了在子程序的构造中需要最大程度的通用性。
Only two disadvantages are immediately evident. Because of standardization, a small amount of time is lost in performing duplicate data transfers which could be eliminated in a tailor-made routine. In base load problems, this could become serious. Even in this case however, it is worthwhile to have UNIVAC produce the original program and then eliminate such duplication before rerunning the problem. The second disadvantage should not long remain serious. It is the fact that, if a desired subroutine does not exist, it must be programmed and added to the library. This will be most likely to occur in the case of input and output editing routines until a large variety is accumulated. This situation also emphasizes the need for the greatest generality in the construction of subroutines.
可以指出该领域未来发展的几个方向。…更多的 A 型编译例程将被设计出来:那些处理商业程序而不是数学程序的程序;一些特殊用途的编译例程,例如将在进行过程中计算近似大小并相应地选择子例程的例程。编译例程必须知道每个子例程所需的平均时间,以便它们可以提供每个程序运行时间的估计。例如,如果在单个例程中同时调用sin θ和 cos θ ,则它们的同时计算速度会更快。这将涉及扫描计算机信息一次以检查其结构。
Several directions of future developments in this field can be pointed out. …More type A compiling routines will be devised: those handling commercial rather than mathematical programs; some special purpose compiling routines such as a routine which will compute approximate magnitudes as it proceeds and select sub-routines accordingly. Compiling routines must be informed of the average time required for each sub-routine so that they can supply estimates of running time with each program. For example, if both sin θ and cos θ are called for in a single routine, they will be computed more rapidly simultaneously. This will involve sweeping the computer information once to examine its structure.
B 类例程目前包括线性运算符。必须设计更多的B 类例程。几乎不可否认,类型 C 和 D 例程将被发现存在,增加了更高级别的操作。除了生成计算程序之外,还正在进行以代数形式生成 B 型例程开发的公式的工作。
Type B routines at present include linear operators. More type B routines must be designed. It can scarcely be denied that type C and D routines will be found to exist adding higher levels of operation. Work is already in progress to produce the formulas developed by type B routines in algebraic form in addition to producing their computational programs.
因此,通过将专业程序员(而不是数学家)视为计算机的一个组成部分,很明显,程序员的记忆以及他可以参考的所有信息和数据对于计算机来说都是可用的,只需将其翻译成合适的语言。同样明显的是,计算机完全有能力记住程序员曾经提交给它的任何指令并据此采取行动。
Thus by considering the professional programmer (not the mathematician), as an integral part of the computer, it is evident that the memory of the programmer and all information and data to which he can refer is available to the computer subject only to translation into suitable language. And it is further evident that the computer is fully capable of remembering and acting upon any instructions once presented to it by the programmer.
凭借一些更高级主题的专业知识,UNIVAC 目前拥有完全相当于大学二年级学生的扎实数学教育,并且不会忘记,也不会犯错误。希望其本科课程能尽快完成,并被接受为研究生学位的候选人。
With some specialized knowledge of more advanced topics, UNIVAC at present has a well grounded mathematical education fully equivalent to that of a college sophomore, and it does not forget and does not make mistakes. It is hoped that its undergraduate course will be completed shortly and it will be accepted as a candidate for a graduate degree.
经计算机协会许可,由 Hopper (1952) 转载。
Reprinted from Hopper (1952), with permission from the Association for Computing Machinery.
数学家 Joseph Kruskal(1928-2010)的这篇论文标志着从运筹学(这一领域在 20 世纪 40 年代因解决各种商业、军事和后勤问题的需要而蓬勃发展)通往计算机科学(有些人将其定义为算法研究)的道路上迈出的早期一步。这里以“构造 A”提出的特定算法现在称为 Kruskal 算法,它通过生长森林、从最短到最长逐条添加边(不包括任何会形成环的边)来找到无向图中的最小成本生成树。论文在第一句话中提到的 Borůvka (1926) 可能是第一个针对该问题发表的算法,其历史可以追溯到图论基本词汇正式化之前的时代。在那篇论文中,捷克数学家奥塔卡·博鲁夫卡 (Otakar Borůvka) 解决了为摩拉维亚(现属于捷克共和国的一个地区)寻找最高效电网的问题。
This paper by mathematician Joseph Kruskal (1928–2010) marks an early step along a path from operations research—a field that had flourished during the 1940s in response to the need to solve a variety of commercial, military, and logistical problems—to computer science, which some define as the study of algorithms. The particular algorithm presented here as Construction A, now known as Kruskal’s algorithm, finds the minimum-cost spanning tree in an undirected graph by growing a forest, adding edges one by one from shortest to longest, not including any edge that would create a cycle. Borůvka (1926), to which the paper refers in its first sentence, is probably the first published algorithm for this problem, dating from a time before the basic vocabulary of graph theory had been formalized. In that paper, the Czech mathematician Otakar Borůvka had solved the problem of finding the most efficient electrical network for Moravia, a region of what is now the Czech Republic.
对于当代读者来说,克鲁斯卡尔的演讲因其清晰性而引人注目——算法的描述和证明与最近的任何处理一样简洁而优雅——并且完全没有任何分析。Kruskal 将他的算法描述为“实用”,但没有暗示这在运行时或任何其他特定形式的复杂性方面可能意味着什么,甚至没有暗示哪些参数最严重地影响算法的行为。
To contemporary readers, Kruskal’s presentation is remarkable for its lucidity—the description and proof of the algorithm are as spare and elegant as in any more recent treatment—and for the complete absence of any analysis. Kruskal describes his algorithm as “practical” without suggesting what that might mean in terms of runtime or any other specific form of complexity, or even what parameters most seriously affect the algorithm’s behavior.
解构“实用”一词可以概括算法分析的非凡历史。又过了十年,埃德蒙兹和科巴姆才将多项式运行时间描述为实用性直观概念的粗略代理(参见第 333 页)。但随后 Donald Knuth 提出了更细粒度的算法分析的标准(参见第 43 章)。考虑到使用朴素方法按顺序检索边所需的时间,Kruskal 算法的时间复杂度为O ( m log n ),其中m是边的数量,n是顶点的数量。但这还不是故事的结局。Knuth 的学生 Robert Tarjan (1975) 对一种用于跟踪集合及其并集的已知简单算法进行了出色的分析。使用这些数据结构,在某些普遍满足的假设下,Kruskal 算法的复杂度为O ( m α ( n ))。这里α ( n ) 是阿克曼函数的反函数——一个无界函数,其增长速度非常缓慢,以至于对于所有n小于宇宙中粒子数的情况,α ( n ) < 5。出于所有意图和目的,克鲁斯卡尔算法的良好实现的运行时间与边的数量成简单的比例增长。
Deconstructing that term “practical” recapitulates the remarkable history of algorithm analysis. It would take another decade before Edmonds and Cobham would characterize polynomial running time as a rough proxy for the intuitive notion of practicality (see page 333). But then Donald Knuth raised the bar for finer-grained algorithm analysis (see chapter 43). Taking into account the time required to retrieve the edges in order using naïve methods, the time complexity of Kruskal’s algorithm is O(m log n), where m is the number of edges and n is the number of vertices. But that is not the end of the story. Knuth’s student Robert Tarjan (1975) developed a remarkable analysis of a known simple algorithm for keeping track of sets and their unions. Using those data structures, the complexity of Kruskal’s algorithm turns out to be O(m α(n)) under certain commonly satisfied assumptions. Here α(n) is the inverse of Ackermann’s function, an unbounded function that grows so slowly that α(n) < 5 for all n less than the number of particles in the universe. For all intents and purposes, the running time of a good implementation of Kruskal’s algorithm grows in simple proportion to the number of edges.
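A structure for keeping track of sets and their unions, of the kind this analysis concerns, might look like the following minimal sketch. The class name `DisjointSets` and the particular heuristics shown (union by rank and path compression) are choices made for this illustration; the text above does not specify them.

```python
class DisjointSets:
    """Tracks a partition of the items 0..n-1 into disjoint sets."""

    def __init__(self, n):
        self.parent = list(range(n))  # each item starts as the root of its own set
        self.rank = [0] * n           # upper bound on the height of each root's tree

    def find(self, v):
        """Return the representative (root) of the set containing v."""
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every item on the path directly at the root.
        while self.parent[v] != root:
            self.parent[v], v = root, self.parent[v]
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra           # attach the shallower tree under the deeper one
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```

In a spanning-tree computation each vertex is an item, and an edge joining u and v closes a cycle exactly when `union(u, v)` returns `False`.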
克鲁斯卡尔注意到最小生成树问题和旅行商问题之间的优雅相似性,但没有在后者的有效解决方案方面取得任何进展。直到 20 世纪 70 年代初,这些问题之间的巨大差异才得到澄清,当时卡普表明旅行商问题是𝒩 𝒫 -完全问题。(无向汉密尔顿电路,第 354 页,是旅行商问题的一个特例,其中相邻顶点之间的距离均为1。)
Kruskal notes the elegant similarity between the minimum spanning tree problem and the traveling salesman problem, without making any progress toward an efficient solution of the latter. The dramatic difference between these problems would not be clarified until the early 1970s, when Karp showed that the traveling salesman problem is 𝒩𝒫-complete. (UNDIRECTED HAMILTON CIRCUIT, page 354, is a special case of the traveling salesman problem in which distances between adjacent vertices are all 1.)
另一种寻找最小生成树的主要算法被称为 Prim 算法,以纪念 Robert Clay Prim (1957) 的发表。事实上,它是 Kruskal 构造 B 的一个特例,其中他称为 V 的集合仅包含一个顶点。该算法通过添加与现有树相邻的边来生长一棵树,从最短的边开始(并且不包括会形成环的边);与构造 A 一样,这是现在称为贪婪策略的一个示例。Prim 的算法被 Edsger Dijkstra (1959) 在开发最短路径问题算法的过程中重新发现,尽管事实上 Prim、Kruskal 的构造 B 和 Dijkstra 都被另一位捷克人 Vojtěch Jarník (1930) 更早的工作所预见。(Kruskal 还包括第三个没有通用名称的贪婪算法 A′。)Warshall (1962) 的传递闭包算法和 Floyd (1962) 的全对最短路径算法都是动态规划的示例,标志着从运筹学到计算机科学的道路上又向前迈进了一步。
The other leading algorithm for finding the minimum spanning tree is known as Prim’s algorithm in honor of its publication by Robert Clay Prim (1957)—in fact it is a special case of Kruskal’s Construction B in which the set he refers to as V includes only one vertex. This algorithm grows a single tree by adding edges adjacent to the existing tree, starting with the shortest (and not including edges that would create a cycle)—like Construction A, an example of what is now known as a greedy strategy. Prim’s algorithm was rediscovered by Edsger Dijkstra (1959) in the course of developing his algorithm for the shortest paths problem, though in fact Prim, Kruskal Construction B, and Dijkstra were all anticipated by the much earlier work of another Czech, Vojtěch Jarník (1930). (Kruskal also includes a third greedy algorithm A′ that has no common name.) The transitive closure algorithm of Warshall (1962) and the all-pairs shortest path algorithm of Floyd (1962) are both examples of dynamic programming, marking another step along the path from operations research to computer science.
约瑟夫·克鲁斯卡尔 (Joseph Kruskal) 是一个杰出数学家族的成员。他的父亲是一名犹太移民,1892 年从波罗的海来到纽约,后来成为一名成功的皮货商。他的母亲是将折纸引入美国的先驱。约瑟夫的大哥威廉成为芝加哥大学的统计学教授,二哥马丁·大卫成为普林斯顿大学的天体物理科学和应用数学教授。他的侄子克莱德·克鲁斯卡尔是马里兰大学计算机科学教授。约瑟夫在普林斯顿大学攻读研究生时撰写了这篇论文,并于 1954 年获得博士学位,并在美国海军研究办公室的支持下在普林斯顿大学工作期间发表。1959 年,他加入贝尔实验室,并在那里度过了他职业生涯的大部分时间。他在统计学和统计语言学以及我们现在所知的计算机科学领域做出了重要的工作。
Joseph Kruskal was a member of a remarkable mathematical family. His father, a Jewish immigrant from the Baltics to New York in 1892, became a successful furrier; his mother was a pioneer in introducing origami to America. Joseph’s oldest brother, William, became a professor of statistics at the University of Chicago, and a second brother, Martin David, was a professor of astrophysical sciences and applied mathematics at Princeton. His nephew Clyde Kruskal is a professor of computer science at the University of Maryland. Joseph wrote this paper while he was a graduate student at Princeton, where he received his PhD in 1954, and it was published while he was working at Princeton under the auspices of the U.S. Office of Naval Research. In 1959 he joined Bell Labs, where he spent most of his career. He did important work in statistics and statistical linguistics as well as in the field we now know as computer science.
几年前,Borůvka (1926) 的一份打字翻译稿(来源不明)引起了一些人的兴趣。本文致力于以下定理:如果(有限)连通图的每条边都附加一个正实数(边的长度),并且这些长度都不同,则在图的生成树(德语:Gerüst)中只有一个其边长之和最小;也就是说,图的最短生成树是唯一的。(如果子图包含图的所有顶点,则称子图生成该图。实际上,在 Borůvka [1926] 中,该定理是根据图的“长度矩阵”来陈述和证明的,即矩阵 ‖a_ij‖,其中 a_ij 是连接顶点 i 和 j 的边的长度。当然,假设对于所有 i 和 j,a_ij = a_ji 且 a_ii = 0。)
SEVERAL years ago a typewritten translation (of obscure origin) of Borůvka (1926) raised some interest. This paper is devoted to the following theorem: If a (finite) connected graph has a positive real number attached to each edge (the length of the edge), and if these lengths are all distinct, then among the spanning trees (German: Gerüst) of the graph there is only one, the sum of whose edges is a minimum; that is, the shortest spanning tree of the graph is unique. (A subgraph spans a graph if it contains all the vertices of the graph. Actually in Borůvka [1926] this theorem is stated and proved in terms of the “matrix of lengths” of the graph, that is, the matrix ‖a_ij‖ where a_ij is the length of the edge connecting vertices i and j. Of course, it is assumed that a_ij = a_ji and that a_ii = 0 for all i and j.)
Borůvka (1926) 中的证明基于构造最小长度生成子树的合理方法。兴趣主要在于这种结构,因为它是一个问题(下面的问题 1)的解决方案,该问题表面上与著名的旅行商问题的一个版本(下面的问题 2)密切相关。
The proof in Borůvka (1926) is based on a not unreasonable method of constructing a spanning subtree of minimum length. It is in this construction that the interest largely lies, for it is a solution to a problem (Problem 1 below) which on the surface is closely related to one version (Problem 2 below) of the well-known traveling salesman problem.
问题1.给出构造最小长度生成子树的实用方法。
PROBLEM 1. Give a practical method for constructing a spanning subtree of minimum length.
问题2. 给出构造最小长度无分支生成子树的实用方法。
PROBLEM 2. Give a practical method for constructing an unbranched spanning subtree of minimum length.
Borůvka (1926) 中给出的结构过于复杂。在本文中,我给出了解决问题 1 的几个更简单的构造,并展示了如何使用这些构造之一来证明 Borůvka (1926) 定理。很可能任何解决问题 1 的构造都可以用来证明这个定理。
The construction given in Borůvka (1926) is unnecessarily elaborate. In the present paper I give several simpler constructions which solve Problem 1, and I show how one of these constructions may be used to prove the theorem of Borůvka (1926). Probably it is true that any construction which solves Problem 1 may be used to prove this theorem.
首先我想指出,假设给定的连通图G是完备的,即每对顶点都由一条边连接,这并不失一般性。因为如果G的任何边“缺失”,则可以插入很长的边,并且这不会以与当前目的相关的任何方式改变图。此外,将缺失的边缘视为无限长度的边缘是可能且直观地吸引人的。
First I would like to point out that there is no loss of generality in assuming that the given connected graph G is complete, that is, that every pair of vertices is connected by an edge. For if any edge of G is “missing,” an edge of great length may be inserted, and this does not alter the graph in any way which is relevant to the present purposes. Also, it is possible and intuitively appealing to think of missing edges as edges of infinite length.
构造A. 尽可能多地执行以下步骤: 在G尚未选择的边中,选择不与已选择的边形成任何环的最短边。显然,最终选择的边集必须形成G的生成树,并且实际上它形成最短生成树。
CONSTRUCTION A. Perform the following step as many times as possible: Among the edges of G not yet chosen, choose the shortest edge which does not form any loops with those edges already chosen. Clearly the set of edges eventually chosen must form a spanning tree of G, and in fact it forms a shortest spanning tree.
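Construction A can be sketched directly in code. This is a minimal illustration that favors clarity over speed; the function name `construction_a`, the numbering of vertices from 0 to n − 1, and the representation of edges as `(length, u, v)` tuples are assumptions of the example.

```python
def construction_a(n, edges):
    """Return the edges chosen by Construction A.

    n     -- number of vertices, labeled 0..n-1
    edges -- iterable of (length, u, v) tuples
    """
    # component[v] labels the tree of the current forest that contains v.
    component = list(range(n))
    chosen = []
    for length, u, v in sorted(edges):      # consider edges from shortest to longest
        if component[u] == component[v]:
            continue                         # this edge would form a loop
        # Merge the two trees by relabeling one of them.
        old, new = component[v], component[u]
        component = [new if c == old else c for c in component]
        chosen.append((length, u, v))
    return chosen
```

On a graph with four vertices and edges `(1, 0, 1)`, `(2, 1, 2)`, `(3, 0, 2)`, `(4, 2, 3)`, `(5, 1, 3)`, the edges of length 3 and 5 are rejected because each would close a loop, leaving a spanning tree of total length 7.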
构造B. 设V是G顶点的任意但固定(非空)子集。然后尽可能多次地执行以下步骤:在G的尚未选择、但连接到V的某个顶点或某条已选择的边的边中,选择不与已选择的边形成任何环的最短边。显然,最终选择的边集形成了G的生成树,并且实际上它形成了最短生成树。如果V是G的所有顶点的集合,则构造 B 简化为构造 A。
CONSTRUCTION B. Let V be an arbitrary but fixed (nonempty) subset of the vertices of G. Then perform the following step as many times as possible: Among the edges of G which are not yet chosen but which are connected either to a vertex of V or to an edge already chosen, pick the shortest edge which does not form any loops with the edges already chosen. Clearly the set of edges eventually chosen forms a spanning tree of G, and in fact it forms a shortest spanning tree. In case V is the set of all vertices of G, then Construction B reduces to Construction A.
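Construction B differs from Construction A only in which edges are eligible at each step. The sketch below makes that explicit; the function name `construction_b`, the parameter name `start` for the set V, and the edge representation `(length, u, v)` over vertices 0 to n − 1 are assumptions of the example, and no attempt is made at efficiency.

```python
def construction_b(n, edges, start):
    """Return the edges chosen by Construction B, growing outward from `start`."""
    touched = set(start)            # vertices of V plus endpoints of chosen edges
    if not touched:
        raise ValueError("V must be nonempty")
    component = list(range(n))      # component[v] labels the tree containing v
    remaining = sorted(edges)       # unchosen edges, shortest first
    chosen = []
    while True:
        # Shortest unchosen edge that touches V or a chosen edge and forms no loop.
        pick = next(
            ((length, u, v) for length, u, v in remaining
             if (u in touched or v in touched) and component[u] != component[v]),
            None,
        )
        if pick is None:
            return chosen
        length, u, v = pick
        remaining.remove(pick)
        old, new = component[v], component[u]
        component = [new if c == old else c for c in component]
        touched.update((u, v))
        chosen.append(pick)
```

When `start` holds a single vertex, the chosen edges always form one growing tree; when `start` holds every vertex, every edge is eligible from the outset and the edges are chosen in the same order as in Construction A.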
构造A′。该方法在某种意义上与 A 对偶。尽可能多次执行以下步骤:在尚未选择的边中,选择删除后不会使它们断开的最长边。显然,最终未被选择的边集形成了G的生成树,并且实际上它形成了最短生成树。我不清楚构造 B 是否总体上具有与此类似的对偶。
CONSTRUCTION A′. This method is in some sense dual to A. Perform the following step as many times as possible: Among the edges not yet chosen, choose the longest edge whose removal will not disconnect them. Clearly the set of edges not eventually chosen forms a spanning tree of G, and in fact it forms a shortest spanning tree. It is not clear to me whether Construction B in general has a dual analogous to this.
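Construction A′ can be sketched in the same style. The names `is_connected` and `construction_a_prime` are hypothetical, edges are again `(length, u, v)` tuples over vertices 0 to n − 1, and the sketch assumes the input graph is connected, has at least one vertex, and contains no two identical edge tuples. A single pass from longest to shortest suffices here, because an edge whose removal would disconnect the remaining edges stays that way as further edges are removed.

```python
def is_connected(n, edges):
    """Return True if the edges connect all of the vertices 0..n-1."""
    adjacency = {v: [] for v in range(n)}
    for _, u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen = {0}
    stack = [0]
    while stack:                     # depth-first search from vertex 0
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def construction_a_prime(n, edges):
    """Return the edges left over after removing every removable long edge."""
    kept = sorted(edges)
    for edge in sorted(edges, reverse=True):   # longest edge first
        trial = [e for e in kept if e != edge]
        if is_connected(n, trial):             # removal does not disconnect the rest
            kept = trial
    return kept
```

On the four-vertex example with edges of lengths 1 through 5 used earlier, the edges of length 5 and 3 are removed and the same three edges remain as Construction A selects.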
在展示如何使用构造 A 来证明 Borůvka (1926) 定理之前,我发现将图论的许多基本事实组合成一个定理是很方便的。读者应该毫不费力地说服自己这些都是真的。出于美观的原因,我所说的远远超出了我的需要。
Before showing how Construction A may be used to prove the theorem of Borůvka (1926), I find it convenient to combine into a theorem a number of elementary facts of graph theory. The reader should have no trouble convincing himself that these are true. For aesthetic reasons, I state considerably more than I need.
初步定理。如果 G 是一个有 n 个顶点的连通图,并且 T 是 G 的子图,则以下条件全部等价:
PRELIMINARY THEOREM. If G is a connected graph with n vertices, and T is a subgraph of G, then the following conditions are all equivalent:
(a) T是G的生成树;
(a) T is a spanning tree of G;
(b) T是G中的最大森林;
(b) T is a maximal forest in G;
(c) T是G的最小连通生成图;
(c) T is a minimal connected spanning graph of G;
(d) T 是一个有 n − 1 条边的森林;
(d) T is a forest with n − 1 edges;
(e) T 是具有 n − 1条边的连通生成图。
(e) T is a connected spanning graph with n − 1 edges.
(如果一个图不包含在同类型的任何较大图中,则该图是“最大”的;如果它不包含同类型的任何较小图,则该图是“最小”的。“森林”是不包含任何环的图。)要证明的定理指出,如果G的边都具有不同的长度,则T是唯一的,其中T是G的任何最短生成树。显然,T可以被重新定义为任何具有n − 1 条边的最短森林。
(A graph is “maximal” if it is not contained in any larger graph of the same sort; it is “minimal” if it does not contain any smaller graph of the same sort. A “forest” is a graph which does not have any loops.) The theorem to be proved states that if the edges of G all have distinct lengths, then T is unique, where T is any shortest spanning tree of G. Clearly T may be redefined as any shortest forest with n − 1 edges.
在构造 A 中,让所选边按所选顺序称为 a_1, …, a_{n−1}。令 A_i 为由边 a_1 到 a_i 组成的森林。下面将证明 T = A_{n−1}。根据 G 的边具有不同长度的假设,很容易看出构造 A 以唯一的方式进行。因此 A_i 是唯一的,因此 T 也是唯一的。
In Construction A, let the edges chosen be called a_1, …, a_{n−1} in the order chosen. Let A_i be the forest consisting of edges a_1 through a_i. It will be proved that T = A_{n−1}. From the hypothesis that the edges of G have distinct lengths, it is easily seen that Construction A proceeds in a unique manner. Thus the A_i are unique, and hence also T.
仍有待证明 T = A_{n−1}。如果 T ≠ A_{n−1},则令 a_i 为 A_{n−1} 中不在 T 中的第一条边。那么 a_1, …, a_{i−1} 都在 T 中。T ∪ a_i 必须恰好有一个环,其中必须包含 a_i。该环还必须包含某条不在 A_{n−1} 中的边 e。那么 T ∪ a_i − e 是一个有 n − 1 条边的森林。
It remains to prove that T = A_{n−1}. If T ≠ A_{n−1}, let a_i be the first edge of A_{n−1} which is not in T. Then a_1, …, a_{i−1} are in T. T ∪ a_i must have exactly one loop, which must contain a_i. This loop must also contain some edge e which is not in A_{n−1}. Then T ∪ a_i − e is a forest with n − 1 edges.
由于 A_{i−1} ∪ e 包含在最后提到的森林中,因此它是一个森林,所以根据构造 A,length(e) > length(a_i)。
As A_{i−1} ∪ e is contained in the last named forest, it is a forest, so from Construction A, length(e) > length(a_i).
但这样 T ∪ a_i − e 就比 T 短。这与 T 的定义相矛盾,因此间接证明了 T = A_{n−1}。
But then T ∪ a_i − e is shorter than T. This contradicts the definition of T, and hence proves indirectly that T = A_{n−1}.
经美国数学会许可,转载自 Kruskal (1956)。
Reprinted from Kruskal (1956), with permission from the American Mathematical Society.
弗兰克·罗森布拉特(Frank Rosenblatt,1928-1971)的“感知器”既是迈向人工智能的戏剧性的早期一步,也是学术竞争的暂时牺牲品,还是科学研究潮流跟风现象的一个案例研究。罗森布拉特在康奈尔大学接受心理学教育,1956 年获得博士学位后加入康奈尔航空实验室,开始开发人工感知机制。在那里,他可以使用当时被视为大型计算机的 IBM 704。1957 年,他对感知器的首次尝试是一项早期实验,即通过强化学习训练人工模式识别系统以提高其性能。这篇 1958 年的论文描述了他的改进模型,其中包含各种可调参数和常用于描述物理系统的数学。罗森布拉特作为一名心理学教授,一直从事这一方向的研究以及神经科学领域其他截然不同的实验,直到他在四十三岁生日当天因划船事故去世。
The “perceptron” of Frank Rosenblatt (1928–1971) is at once a dramatic early step toward artificial intelligence, a temporary casualty of intellectual rivalry, and a case study in the faddishness of scientific research trends. Educated at Cornell as a psychologist, Rosenblatt began developing an artificial perceptual mechanism when he joined Cornell’s Aeronautical Laboratory after receiving his PhD in 1956. There he had access to what then counted as a large computer, the IBM 704. His first cut at a perceptron, in 1957, was an early experiment in training an artificial pattern-recognition system to improve its performance through reinforcement learning. This 1958 paper describes his refined model, complete with various tunable parameters and mathematics commonly used to describe physical systems. Rosenblatt pursued this line of research and other dramatically different experiments in neuroscience as a psychology professor until his death in a boating accident on his forty-third birthday.
罗森布拉特走在了他的时代的前面,从他的发明到今天经过训练的神经网络的传承绝非直接的。1969 年,麻省理工学院人工智能创始人 Marvin Minsky 和他在那里的心理学家同事 Seymour Papert 发表了《感知器》(Perceptrons)(Minsky 和 Papert,1969),研究了 Rosenblatt 模型的受限“单层”版本的能力。《感知器》一书正确地指出了单层网络的局限性。作为一个科学问题,这本不足以使对完整感知器模型的研究偏离正轨,但它似乎产生了这种效果,也许是因为它发表之时,人工智能在产生重要实验结果方面的进展正缓慢得令人沮丧。明斯基(在布朗克斯科学高中比罗森布拉特低一年级)和他的麻省理工学院同事(其中一些很快将前往斯坦福大学)倡导一种不同的人工智能方法,即通过逻辑和其他符号系统,而不是通过模仿人脑的架构。
Rosenblatt was ahead of his time, and the line of descent from his invention to today’s trained neural nets is anything but direct. In 1969 Marvin Minsky, a founding father of artificial intelligence at MIT, and Seymour Papert, his psychologist colleague there, published Perceptrons (Minsky and Papert, 1969), which studied the power of a restricted “one-layer” version of Rosenblatt’s model. Perceptrons correctly showed the limitations of one-layer nets. As a scientific matter, that should not have been enough to sidetrack research on the full perceptron model, but it seems to have had that effect, perhaps because it was published at a moment of frustratingly slow progress in producing nontrivial experimental results in artificial intelligence. Minsky (who had been one grade behind Rosenblatt at Bronx High School of Science) and his MIT colleagues (some soon to depart for Stanford) were championing a different approach to AI, through logic and other symbolic systems rather than through mimicking the architecture of the human brain.
Rosenblatt 的早期研究在 20 世纪 80 年代重见天日,当时计算机的功能开始强大到足以模拟多层网络并在相当规模的数据集上对其进行训练。因此,神经计算模型现在引起了人工智能研究人员的极大兴趣。
Rosenblatt’s early research came out of mothballs in the 1980s, when computers began to be powerful enough to simulate multi-layer nets and to train them on sizable data sets. As a result neural computing models are now of great interest to researchers in artificial intelligence.
如果我们最终要了解高等生物的感知识别、概括、回忆和思考的能力,我们必须首先回答三个基本问题:
IF we are eventually to understand the capability of higher organisms for perceptual recognition, generalization, recall, and thinking, we must first have answers to three fundamental questions:
1. 生物系统如何感知或检测有关物理世界的信息?
1. How is information about the physical world sensed, or detected, by the biological system?
2. 信息以什么形式存储或记忆?
2. In what form is information stored, or remembered?
3. 存储或记忆中包含的信息如何影响识别和行为?
3. How does information contained in storage, or in memory, influence recognition and behavior?
这些问题中的第一个问题属于感觉生理学领域,并且是唯一一个已经获得相当程度理解的问题。本文将主要关注第二个问题和第三个问题,这些问题仍然存在大量的猜测,并且神经生理学目前提供的少数相关事实尚未整合成可接受的理论。
The first of these questions is in the province of sensory physiology, and is the only one for which appreciable understanding has been achieved. This article will be concerned primarily with the second and third questions, which are still subject to a vast amount of speculation, and where the few relevant facts currently supplied by neurophysiology have not yet been integrated into an acceptable theory.
关于第二个问题,一直存在两种不同的立场。第一种立场认为,感官信息的存储是以编码表示或图像的形式进行的,在感官刺激和存储的模式之间存在某种一对一的映射。根据这一假设,如果一个人理解了神经系统的代码或“线路图”,原则上,他应该能够通过从原始感觉模式留下的“记忆痕迹”中重建这些模式,来准确地发现生物体记住了什么,就像我们可以冲洗照相负片,或者翻译数字计算机“内存”中的电荷模式一样。这一假设因其简单性和易懂性而颇具吸引力,并且围绕编码的、表征式记忆的理念开发了一大批理论大脑模型(Culbertson,1950、1956;Köhler,1951;Rashevsky,1938)。另一种方法源于英国经验主义的传统,它大胆猜测:刺激的图像可能根本不会真正被记录下来,中枢神经系统只是充当一个复杂的转换网络,其中记忆的保留表现为活动中心之间新的联系或路径。在这一立场的许多较新的发展中(例如赫布的“细胞集群”和赫尔的“皮质预期目标反应”),与刺激相关联的“反应”可能完全包含在中枢神经系统本身之内。在这种情况下,反应代表“想法”而不是行动。这种方法的重要特征是,从来不存在按照某种允许日后重建的代码把刺激简单地映射到记忆中的情况。无论保留什么信息,都必须以某种方式存储为对特定反应的偏好;即,信息包含在连接或关联中,而不是地形表示中。(在本文的其余部分,术语“反应”应理解为是指生物体的任何可区分的状态,这可能涉及也可能不涉及外部可检测的肌肉活动。例如,根据这个定义,中枢神经系统中某个细胞核团的激活就可以构成一个反应。)
With regard to the second question, two alternative positions have been maintained. The first suggests that storage of sensory information is in the form of coded representations or images, with some sort of one-to-one mapping between the sensory stimulus and the stored pattern. According to this hypothesis, if one understood the code or “wiring diagram” of the nervous system, one should, in principle, be able to discover exactly what an organism remembers by reconstructing the original sensory patterns from the “memory traces” which they have left, much as we might develop a photographic negative, or translate the pattern of electrical charges in the “memory” of a digital computer. This hypothesis is appealing in its simplicity and ready intelligibility, and a large family of theoretical brain models has been developed around the idea of a coded, representational memory (Culbertson, 1950, 1956; Köhler, 1951; Rashevsky, 1938). The alternative approach, which stems from the tradition of British empiricism, hazards the guess that the images of stimuli may never really be recorded at all, and that the central nervous system simply acts as an intricate switching network, where retention takes the form of new connections, or pathways, between centers of activity. In many of the more recent developments of this position (Hebb’s “cell assembly,” and Hull’s “cortical anticipatory goal response,” for example) the “responses” which are associated to stimuli may be entirely contained within the CNS itself. In this case the response represents an “idea” rather than an action. The important feature of this approach is that there is never any simple mapping of the stimulus into memory, according to some code which would permit its later reconstruction. Whatever information is retained must somehow be stored as a preference for a particular response; i.e., the information is contained in connections or associations rather than topographic representations. 
(The term response, for the remainder of this presentation, should be understood to mean any distinguishable state of the organism, which may or may not involve externally detectable muscular activity. The activation of some nucleus of cells in the central nervous system, for example, can constitute a response, according to this definition.)
Corresponding to these two positions on the method of information retention, there exist two hypotheses with regard to the third question, the manner in which stored information exerts its influence on current activity. The “coded memory theorists” are forced to conclude that recognition of any stimulus involves the matching or systematic comparison of the contents of storage with incoming sensory patterns, in order to determine whether the current stimulus has been seen before, and to determine the appropriate response from the organism. The theorists in the empiricist tradition, on the other hand, have essentially combined the answer to the third question with their answer to the second: since the stored information takes the form of new connections, or transmission channels in the nervous system (or the creation of conditions which are functionally equivalent to new connections), it follows that the new stimuli will make use of these new pathways which have been created, automatically activating the appropriate response without requiring any separate process for their recognition or identification.
The theory to be presented here takes the empiricist, or “connectionist” position with regard to these questions. The theory has been developed for a hypothetical nervous system, or machine, called a perceptron. The perceptron is designed to illustrate some of the fundamental properties of intelligent systems in general, without becoming too deeply enmeshed in the special, and frequently unknown, conditions which hold for particular biological organisms. The analogy between the perceptron and biological systems should be readily apparent to the reader.
During the last few decades, the development of symbolic logic, digital computers, and switching theory has impressed many theorists with the functional similarity between a neuron and the simple on-off units of which computers are constructed, and has provided the analytical methods necessary for representing highly complex logical functions in terms of such elements. The result has been a profusion of brain models which amount simply to logical contrivances for performing particular algorithms (representing “recall,” stimulus comparison, transformation, and various kinds of analysis) in response to sequences of stimuli—e.g., Rashevsky (1938); McCulloch (1951); McCulloch and Pitts (1943); Culbertson (1950); Kleene (1951); Minsky (1956). A relatively small number of theorists, like Ashby (1952), von Neumann (1951), and von Neumann (1956), have been concerned with the problems of how an imperfect neural network, containing many random connections, can be made to perform reliably those functions which might be represented by idealized wiring diagrams. Unfortunately, the language of symbolic logic and boolean algebra is less well suited for such investigations. The need for a suitable language for the mathematical analysis of events in systems where only the gross organization can be characterized, and the precise structure is unknown, has led the author to formulate the current model in terms of probability theory rather than symbolic logic.
The theorists referred to above were chiefly concerned with the question of how such functions as perception and recall might be achieved by a deterministic physical system of any sort, rather than how this is actually done by the brain. The models which have been produced all fail in some important respects (absence of equipotentiality, lack of neuroeconomy, excessive specificity of connections and synchronization requirements, unrealistic specificity of stimuli sufficient for cell firing, postulation of variables or functional features with no known neurological correlates, etc.) to correspond to a biological system. The proponents of this line of approach have maintained that, once it has been shown how a physical system of any variety might be made to perceive and recognize stimuli, or perform other brainlike functions, it would require only a refinement or modification of existing principles to understand the working of a more realistic nervous system, and to eliminate the shortcomings mentioned above. The writer takes the position, on the other hand, that these shortcomings are such that a mere refinement or improvement of the principles already suggested can never account for biological intelligence; a difference in principle is clearly indicated. The theory of statistical separability (Rosenblatt, 1958b), which is to be summarized here, appears to offer a solution in principle to all of these difficulties.
Those theorists—Hebb (1949); Milner (1957); Eccles (1953); Hayek (1952)—who have been more directly concerned with the biological nervous system and its activity in a natural environment, rather than with formally analogous machines, have generally been less exact in their formulations and far from rigorous in their analysis, so that it is frequently hard to assess whether or not the systems that they describe could actually work in a realistic nervous system, and what the necessary and sufficient conditions might be. Here again, the lack of an analytic language comparable in proficiency to the boolean algebra of the network analysts has been one of the main obstacles. The contributions of this group should perhaps be considered as suggestions of what to look for and investigate, rather than as finished theoretical systems in their own right. Seen from this viewpoint, the most suggestive work, from the standpoint of the following theory, is that of Hebb and Hayek.
The position, elaborated by Hebb (1949), Hayek (1952), Uttley (1956), and Ashby (1952), in particular, upon which the theory of the perceptron is based, can be summarized by the following assumptions:
1. The physical connections of the nervous system which are involved in learning and recognition are not identical from one organism to another. At birth, the construction of the most important networks is largely random, subject to a minimum number of genetic constraints.
2. The original system of connected cells is capable of a certain amount of plasticity; after a period of neural activity, the probability that a stimulus applied to one set of cells will cause a response in some other set is likely to change, due to some relatively long-lasting changes in the neurons themselves.
3. Through exposure to a large sample of stimuli, those which are most “similar” (in some sense which must be defined in terms of the particular physical system) will tend to form pathways to the same sets of responding cells. Those which are markedly “dissimilar” will tend to develop connections to different sets of responding cells.
4. The application of positive and/or negative reinforcement (or stimuli which serve this function) may facilitate or hinder whatever formation of connections is currently in progress.
5. Similarity, in such a system, is represented at some level of the nervous system by a tendency of similar stimuli to activate the same sets of cells. Similarity is not a necessary attribute of particular formal or geometrical classes of stimuli, but depends on the physical organization of the perceiving system, an organization which evolves through interaction with a given environment. The structure of the system, as well as the ecology of the stimulus-environment, will affect, and will largely determine, the classes of “things” into which the perceptual world is divided.
The organization of a typical photo-perceptron (a perceptron responding to optical patterns as stimuli) is shown in Figure 18.1. The rules of its organization are as follows:
1. Stimuli impinge on a retina of sensory units (S-points), which are assumed to respond on an all-or-nothing basis, in some models, or with a pulse amplitude or frequency proportional to the stimulus intensity, in other models. In the models considered here, an all-or-nothing response will be assumed.
2. Impulses are transmitted to a set of association cells (A-units) in a “projection area” (AI). This projection area may be omitted in some models, where the retina is connected directly to the association area (AII). The cells in the projection area each receive a number of connections from the sensory points. The set of S-points transmitting impulses to a particular A-unit will be called the origin points of that A-unit. These origin points may be either excitatory or inhibitory in their effect on the A-unit. If the algebraic sum of excitatory and inhibitory impulse intensities is equal to or greater than the threshold (θ) of the A-unit, then the A-unit fires, again on an all-or-nothing basis (or, in some models, which will not be considered here, with a frequency which depends on the net value of the impulses received). The origin points of the A-units in the projection area tend to be clustered or focalized, about some central point, corresponding to each A-unit. The number of origin points falls off exponentially as the retinal distance from the central point for the A-unit in question increases. (Such a distribution seems to be supported by physiological evidence, and serves an important functional purpose in contour detection.)
3. Between the projection area and the association area (AII), connections are assumed to be random. That is, each A-unit in the AII set receives some number of fibers from origin points in the AI set, but these origin points are scattered at random throughout the projection area. Apart from their connection distribution, the AII units are identical with the AI units, and respond under similar conditions.
4. The “responses,” R1, R2, …, Rn are cells (or sets of cells) which respond in much the same fashion as the A-units. Each response has a typically large number of origin points located at random in the AII set. The set of A-units transmitting impulses to a particular response will be called the source-set for that response. (The source-set of a response is identical to its set of origin points in the A-system.) The arrows in Figure 18.1 indicate the direction of transmission through the network. Note that up to AII all connections are forward, and there is no feedback. When we come to the last set of connections, between AII and the R-units, connections are established in both directions. The rule governing feedback connections, in most models of the perceptron, can be either of the following alternatives:
(a) Each response has excitatory feedback connections to the cells in its own source-set, or
(b) Each response has inhibitory feedback connections to the complement of its own source-set (i.e., it tends to prohibit activity in any association cells which do not transmit to it).
The first of these rules seems more plausible anatomically, since the R-units might be located in the same cortical area as their respective source-sets, making mutual excitation between the R-units and the A-units of the appropriate source-set highly probable. The alternative rule (b) leads to a more readily analyzed system, however, and will therefore be assumed for most of the systems to be evaluated here. …
Figure 18.1: Organization of a perceptron.
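The all-or-nothing firing rule of item 2 and the random wiring of item 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Rosenblatt's own formulation: every active S-point is taken to deliver an impulse of unit strength, origin points are drawn uniformly at random (ignoring the clustering described for the projection area), and the sizes used (a 64-point retina, 20 A-units, 5 excitatory and 3 inhibitory connections, θ = 2) are made up. The function names are hypothetical.

```python
import random

def a_unit_fires(stimulus, excitatory, inhibitory, theta):
    """All-or-nothing rule: fire iff excitatory minus inhibitory input >= theta."""
    # Assumption: each active S-point contributes an impulse of strength 1.
    net = sum(stimulus[i] for i in excitatory) - sum(stimulus[i] for i in inhibitory)
    return net >= theta

def random_origin_points(n_sensory, x, y, rng):
    """Pick x excitatory and y inhibitory origin points uniformly at random."""
    points = rng.sample(range(n_sensory), x + y)   # distinct S-point indices
    return points[:x], points[x:]

rng = random.Random(0)          # fixed seed so the wiring is reproducible
n_sensory = 64                  # illustrative retina size
units = [random_origin_points(n_sensory, 5, 3, rng) for _ in range(20)]

# Illustrative stimulus: the first half of the retina is lit.
stimulus = [1 if i < 32 else 0 for i in range(n_sensory)]
active = [k for k, (exc, inh) in enumerate(units)
          if a_unit_fires(stimulus, exc, inh, theta=2)]
print(f"{len(active)} of {len(units)} A-units fire")
```

Which A-units fire depends entirely on the random wiring, which is the sense in which the theory works with probabilities over connections rather than with a fixed wiring diagram.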
The main conclusions of the theoretical study of the perceptron can be summarized as follows:
1. In an environment of random stimuli, a system consisting of randomly connected units, subject to the parametric constraints discussed above, can learn to associate specific responses to specific stimuli. Even if many stimuli are associated to each response, they can still be recognized with a better-than-chance probability, although they may resemble one another closely and may activate many of the same sensory inputs to the system.
2. In such an “ideal environment,” the probability of a correct response diminishes towards its original random level as the number of stimuli learned increases.
3. In such an environment, no basis for generalization exists.
4. In a “differentiated environment,” where each response is associated to a distinct class of mutually correlated, or “similar” stimuli, the probability that a learned association of some specific stimulus will be correctly retained typically approaches a better-than-chance asymptote as the number of stimuli learned by the system increases. This asymptote can be made arbitrarily close to unity by increasing the number of association cells in the system.
5. In the differentiated environment, the probability that a stimulus which has not been seen before will be correctly recognized and associated to its appropriate class (the probability of correct generalization) approaches the same asymptote as the probability of a correct response to a previously reinforced stimulus. This asymptote will be better than chance if the inequality Pc12 < Pa < Pc11 is met, for the stimulus classes in question.
6. The performance of the system can be improved by the use of a contour-sensitive projection area, and by the use of a binary response system, in which each response, or “bit,” corresponds to some independent feature or attribute of the stimulus.
7. Trial-and-error learning is possible in bivalent reinforcement systems.
8. Temporal organizations of both stimulus patterns and responses can be learned by a system which uses only an extension of the original principles of statistical separability, without introducing any major complications in the organization of the system.
9. The memory of the perceptron is distributed, in the sense that any association may make use of a large proportion of the cells in the system, and the removal of a portion of the association system would not have an appreciable effect on the performance of any one discrimination or association, but would begin to show up as a general deficit in all learned associations.
10. Simple cognitive sets, selective recall, and spontaneous recognition of the classes present in a given environment are possible. The recognition of relationships in space and time, however, seems to represent a limit to the perceptron’s ability to form cognitive abstractions.
Psychologists, and learning theorists in particular, may now ask: “What has the present theory accomplished, beyond what has already been done in the quantitative theories of Hull, Bush and Mosteller, etc., or physiological theories such as Hebb’s?” The present theory is still too primitive, of course, to be considered as a full-fledged rival of existing theories of human learning. Nonetheless, as a first approximation, its chief accomplishment might be stated as follows:
For a given mode of organization (α, β, or γ; Σ or μ; monovalent or bivalent) the fundamental phenomena of learning, perceptual discrimination, and generalization can be predicted entirely from six basic physical parameters, namely:
x: the number of excitatory connections per A-unit,
y: the number of inhibitory connections per A-unit,
θ: the expected threshold of an A-unit,
ω: the proportion of R-units to which an A-unit is connected,
NA: the number of A-units in the system, and
NR: the number of R-units in the system.
NS (the number of sensory units) becomes important if it is very small. It is assumed that the system begins with all units in a uniform state of value; otherwise the initial value distribution would also be required. Each of the above parameters is a clearly defined physical variable, which is measurable in its own right, independently of the behavioral and perceptual phenomena which we are trying to predict.
As a direct consequence of its foundation on physical variables the present system goes far beyond existing learning and behavior theories in three main points: parsimony, verifiability, and explanatory power and generality. Let us consider each of these points in turn.
Reprinted from Rosenblatt (1958a).
Norbert Wiener (1894–1964), the son of a Harvard Slavic instructor, was a child prodigy. When he entered Tufts as a college freshman at the age of 11, a news story described him as “the most remarkable boy in the world” (Unknown, 1906). Wiener conversed with the reporter about philosophy and spoke Latin and Greek. He graduated from Tufts in three years with a degree in mathematics, earned a PhD in philosophy from Harvard at age 18, and started teaching at MIT.
Wiener’s MIT career, which lasted the rest of his life, intersected that of many other pioneers of the computer revolution. In the 1920s, he worked with Vannevar Bush on the design of a device to solve differential equations and made the prescient suggestion to build it using vacuum tubes rather than mechanical linkages. As his attention turned to commonalities between machines and living things, he cooperated with Warren McCulloch and with Walter Pitts, who became his academic protégé until the relationship between the three tragically collapsed (page 80).
Wiener was the father of a field he dubbed “cybernetics,” the study of “control and communication in the animal and the machine.” The crucial concept was “feedback”; the word cybernetics derives from the Greek word for steering a boat, as with a tiller. Wiener’s interest in messages passed between parts of a system to control its behavior naturally connected to the work of Claude Shannon on information theory.
Wiener’s principal mathematical tools were continuous and statistical, rather than discrete and digital. His work on feedback and control was of intense military importance, for example in the guidance of missiles. After the Second World War, he became increasingly troubled by the uses to which his scientific work was being put. In 1946, in an open letter to a military contractor who had asked him for a copy of a paper, he drew a line. “The policy of the government itself during and after the war, say in the bombing of Hiroshima and Nagasaki, has made it clear that to provide scientific information is not a necessarily innocent act. … The interchange of ideas, one of the great traditions of science, must of course receive certain limitations when the scientist becomes an arbiter of life and death. The measures taken during the war by our military agencies, in restricting the free intercourse among scientists on related projects or even on the same project, has gone so far that it is clear that if continued in time of peace this policy will lead to the total irresponsibility of the scientist, and ultimately to the death of science. Both these are disastrous for our civilization, and entail grave and immediate peril for the public.” He refused to share his paper. He knew the paper could be obtained through other means, but he wanted to make a statement.
The letter, published in the Atlantic under the title “A scientist rebels,” was highly publicized (Wiener, 1947). Wiener then backed out of an invitation he had accepted to speak at a major Harvard symposium on automatic computing organized by Howard Aiken with the support of the Navy. Wiener had not meant thereby to make a further public statement, but the programs had already been printed and were distributed at the conference with a line drawn through Wiener’s name. The resulting publicity was intensely embarrassing to all, and Wiener wound up alienated from Aiken and under the suspicion of Senator Joseph McCarthy’s anti-Communist witch hunt. But Wiener had taken a decisive step, and never again accepted any government money in support of his research. Some celebrated his moral stand, but his scientific influence began to fade as defense funding nourished the postwar scientific boom.
Many of the warnings Wiener issued, about the moral responsibilities of scientists and the danger of secret research, resonate today. He devoted his later years to the peaceful uses of cybernetics, and his warnings shifted, as in this paper, from the risks of militarization to the changes in daily life attendant on technological advances.
All of modern computer and communications technology owes a scientific debt to Wiener, but the field he defined has largely disappeared as it became diffused through its various descendants. “Cybernetics” is rarely heard, but “cyber” persists as a prefix for “crime,” “security,” and digital variants of other phenomena. Wiener became what a biographer calls a “dark hero,” omnipresent in his influence and yet invisible (Conway, 2005).
His writings remain premonitory. For all of his stupendous learning and technical wizardry, he cautions us, as he puts it in The Human Use of Human Beings (Wiener, 1950), against “the American worship of know-how as opposed to know-what.”
Some 13 years ago, a book of mine was published by the name of Cybernetics (Wiener, 1948). In it I discussed the problems of control and communication in the living organism and the machine. I made a considerable number of predictions about the development of controlled machines and about the corresponding techniques of automatization, which I foresaw as having important consequences affecting the society of the future. Now, 13 years later, it seems appropriate to take stock of the present position with respect to both cybernetic technique and the social consequences of this technique.
Before commencing on the detail of these matters, I should like to mention a certain attitude of the man in the street toward cybernetics and automatization. This attitude needs a critical discussion, and in my opinion it should be rejected in its entirety. This is the assumption that machines cannot possess any degree of originality. This frequently takes the form of a statement that nothing can come out of the machine which has not been put into it. This is often interpreted as asserting that a machine which man has made must remain continually subject to man, so that its operation is at any time open to human interference and to a change in policy. On the basis of such an attitude, many people have pooh-poohed the dangers of machine techniques, and they have flatly contradicted the early predictions of Samuel Butler that the machine might take over the control of mankind. [EDITOR: Wiener here refers to Butler (1863, 1872), originally published anonymously.]
It is true that in the time of Samuel Butler the available machines were far less hazardous than machines are today, for they involved only power, not a certain degree of thinking and communication. However, the machine techniques of the present day have invaded the latter fields as well, so that the actual machine of today is very different from the image that Butler held, and we cannot transfer to these new devices the assumptions which seemed axiomatic a generation ago. I find myself facing a public which has formed its attitude toward the machine on the basis of an imperfect understanding of the structure and mode of operation of modern machines.
It is my thesis that machines can and do transcend some of the limitations of their designers, and that in doing so they may be both effective and dangerous. It may well be that in principle we cannot make any machine the elements of whose behavior we cannot comprehend sooner or later. This does not mean in any way that we shall be able to comprehend these elements in substantially less time than the time required for operation of the machine, or even within any given number of years or generations.
As is now generally admitted, over a limited range of operation, machines act far more rapidly than human beings and are far more precise in performing the details of their operations. This being the case, even when machines do not in any way transcend man’s intelligence, they very well may, and often do, transcend man in the performance of tasks. An intelligent understanding of their mode of performance may be delayed until long after the task which they have been set has been completed.
This means that though machines are theoretically subject to human criticism, such criticism may be ineffective until long after it is relevant. To be effective in warding off disastrous consequences, our understanding of our man-made machines should in general develop pari passu with the performance of the machine. By the very slowness of our human actions, our effective control of our machines may be nullified. By the time we are able to react to information conveyed by our senses and stop the car we are driving, it may already have run head on into a wall.
I shall come back to this point later in this article. For the present, let me discuss the technique of machines for a very specific purpose: that of playing games. In this matter I shall deal more particularly with the game of checkers, for which the International Business Machines Corporation has developed very effective game-playing machines.
Let me say once for all that we are not concerned here with the machines which operate on a perfect closed theory of the game they play. The game theory of von Neumann and Morgenstern may be suggestive as to the operation of actual game-playing machines, but it does not actually describe them.
In a game as complicated as checkers, if each player tries to choose his play in view of the best move his opponent can make, against the best response he can give, against the best response his opponent can give, and so on, he will have taken upon himself an impossible task. Not only is this humanly impossible, but there is actually no reason to suppose that it is the best policy against the opponent by whom he is faced, whose limitations are equal to his own.
冯诺依曼游戏理论与游戏机运行的理论没有非常密切的关系。后者更接近于专家但有限的人类国际象棋棋手对抗其他国际象棋棋手所使用的游戏方法。这些参与者依赖于某些战略评估,而这些评估本质上是不完整的。虽然冯诺依曼类型的游戏对于井字棋这样的游戏是有效的,并且具有完整的理论,但国际象棋和西洋跳棋的真正有趣之处在于它们不拥有完整的理论。战争、商业竞争以及我们真正感兴趣的任何其他形式的竞争活动都不会。
The von Neumann theory of games bears no very close relation to the theory by which game-playing machines operate. The latter corresponds much more closely to the methods of play used by expert but limited human chess players against other chess players. Such players depend on certain strategic evaluations, which are in essence not complete. While the von Neumann type of play is valid for games like ticktacktoe, with a complete theory, the very interest of chess and checkers lies in the fact that they do not possess a complete theory. Neither do war, nor business competition, nor any of the other forms of competitive activity in which we are really interested.
在像井字棋这样的游戏中,移动次数很少,每个玩家都能够考虑所有可能性并针对其他玩家的最佳可能移动建立防御,冯·诺依曼类型的完整理论是有效的。在这种情况下,游戏必然以第一玩家获胜、第二玩家获胜或平局结束。
In a game like ticktacktoe, with a small number of moves, where each player is in a position to contemplate all possibilities and to establish a defense against the best possible moves of the other player, a complete theory of the von Neumann type is valid. In such a case, the game must inevitably end in a win for the first player, a win for the second player, or a draw.
我强烈质疑这种完美对局的概念在实际的、非平凡的游戏中是否完全现实。像拿破仑这样的伟大将军和像纳尔逊这样的伟大海军上将采取了不同的方式。他们不仅意识到对手在物资和人员等方面的局限性,而且同样意识到对手在经验和军事知识方面的局限性。正是通过对大陆强国在海军作战方面相对缺乏经验、而英国舰队战术和战略能力高度发达这一点的现实评估,纳尔逊才得以展现出把大陆军队赶出海洋的胆略。如果他假设敌人采取最佳策略,就注定要陷入一场漫长的、相对难分胜负的、甚至可能失败的冲突,那样他就不可能做到这一点。
I question strongly whether this concept of the perfect game is a completely realistic one in the cases of actual, nontrivial games. Great generals like Napoleon and great admirals like Nelson have proceeded in a different manner. They have been aware not only of the limitations of their opponents in such matters as materiel and personnel but equally of their limitations in experience and in military know-how. It was by a realistic appraisal of the relative inexperience in naval operations of the continental powers as compared with the highly developed tactical and strategic competence of the British fleet that Nelson was able to display the boldness which pushed the continental forces off the seas. This he could not have done had he engaged in the long, relatively indecisive, and possibly losing conflict to which his assumption of the best possible strategy on the part of his enemy would have doomed him.
纳尔逊不仅评估敌人的物资和人员,还评估他们的判断力以及战术和战略技巧的多少,纳尔逊根据他们以前的战斗记录采取行动。同样,拿破仑在意大利与奥地利人作战的一个重要因素是他对维尔姆瑟的僵化和精神局限性的了解。
In assessing not merely the materiel and personnel of his enemies but also the degree of judgment and the amount of skill in tactics and strategy to be expected of them, Nelson acted on the basis of their record in previous combats. Similarly, an important factor in Napoleon’s conduct of his combat with the Austrians in Italy was his knowledge of the rigidity and mental limitations of Würmser.
在任何现实的游戏理论中,这种体验元素都应该得到充分的认可。对于一名国际象棋棋手来说,下棋是完全合法的,不是与一个理想的、不存在的、完美的对手下棋,而是与他能够从记录中确定其习惯的对手下棋。因此,在博弈论中,至少必须做出两种不同的智力努力。其一是短期努力,以个人游戏的确定策略进行比赛。另一个是检查多场比赛的记录。这个纪录有球员本人创造的,也有他对手创造的,甚至还有没有和他亲自交手过的球员创造的。根据这一记录,他确定了过去所证明的不同政策的相对优势。
This element of experience should receive adequate recognition in any realistic theory of games. It is quite legitimate for a chess player to play, not against an ideal, nonexisting, perfect antagonist, but rather against one whose habits he has been able to determine from the record. Thus, in the theory of games, at least two different intellectual efforts must be made. One is the short-term effort of playing with a determined policy for the individual game. The other is the examination of a record of many games. This record has been set by the player himself, by his opponent, or even by players with whom he has not personally played. In terms of this record, he determines the relative advantages of different policies as proved over the past.
国际象棋对局甚至还需要第三阶段的判断。这至少部分地体现在有意义的过去有多长。国际象棋理论的发展降低了在棋艺不同发展阶段所下对局的重要性。另一方面,精明的国际象棋理论家可能会事先估计到,当前流行的某种策略已经没有多大价值,最好回到早期的下法,以预先应对他可能遇到的对手在策略上的变化。
There is even a third stage of judgment required in a chess game. This is expressed at least in part by the length of the significant past. The development of theory in chess decreases the importance of games played at a different stage of the art. On the other hand, an astute chess theoretician may estimate in advance that a certain policy currently in fashion has become of little value, and that it may be best to return to earlier modes of play to anticipate the change in policy of the people whom he is likely to find as his opponents.
因此,在确定国际象棋策略时,有几个不同层次的考虑,它们在某种程度上对应于伯特兰·罗素的不同逻辑类型。有战术的层次、战略的层次、在确定这一战略时本应权衡的一般考虑的层次、把相关过去的长度(即这些考虑可能有效的那段过去)考虑在内的层次,等等。每一个新的层次都需要研究比前一个层次长得多的过去。
Thus, in determining policy in chess there are several different levels of consideration which correspond in a certain way to the different logical types of Bertrand Russell. There is the level of tactics, the level of strategy, the level of the general considerations which should have been weighed in determining this strategy, the level in which the length of the relevant past—the past within which these considerations may be valid—is taken into account, and so on. Each new level demands a study of a much larger past than the previous one.
我将这些层次与罗素关于类、类的类、类的类的类等等的逻辑类型进行了比较。值得注意的是,罗素并不认为涉及所有类型的陈述都是重要的。他指出诸如理发师为所有人刮胡子、并且只给那些不给自己刮胡子的人刮胡子等问题是徒劳的。他自己刮胡子吗?对于一种类型,他会这样做,对于下一种类型,他不会,等等,无限期地。所有这些涉及无限类型的问题都可能导致无法解决的悖论。同样,在各种复杂程度下寻找最佳政策也是徒劳的,只会导致混乱。
I have compared these levels with the logical types of Russell concerning classes, classes of classes, classes of classes of classes, and so on. It may be noted that Russell does not consider statements involving all types as significant. He brings out the futility of such questions as that concerning the barber who shaves all persons, and only those persons, who do not shave themselves. Does he shave himself? On one type he does, on the next type he does not, and so on, indefinitely. All such questions involving an infinity of types may lead to unsolvable paradoxes. Similarly, the search for the best policy under all levels of sophistication is a futile one and must lead to nothing but confusion.
这些考虑既出现在人的决策中,也出现在机器的决策中。这些都是“对编程进行编程”时出现的问题。最低类型的游戏机按照某种固定的着法评估来下棋。得失棋子的价值、对棋子的控制、棋子的机动性等数量,可以在一定的经验基础上赋予数值权重,并据此为每一步符合游戏规则的下一着赋予一个权重。可以选择权重最大的着法。在这种情况下,机器的下法在它的对手看来(对手不由自主地会评估机器的棋风)是僵化的。
These considerations arise in the determination of policy by machines as well as in the determination of policy by persons. These are the questions which arise in the programming of programming. The lowest type of game-playing machine plays in terms of a certain rigid evaluation of plays. Quantities such as the value of pieces gained or lost, the command of the pieces, their mobility, and so on, can be given numerical weights on a certain empirical basis, and a weighting may be given on this basis to each next play conforming to the rules of the game. The play with the greatest weight may be chosen. Under these circumstances, the play of the machine will seem to its antagonist—who cannot help but evaluate the chess personality of the machine—a rigid one.
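The rigid, lowest-type machine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names, the numerical weights, and the three candidate plays are all hypothetical, since the text says only that the weights are set "on a certain empirical basis."

```python
from typing import Dict

Features = Dict[str, float]

# Hypothetical fixed weights for the quantities the text lists
# (pieces gained or lost, command of the pieces, mobility).
WEIGHTS: Features = {"material": 3.0, "command": 1.0, "mobility": 0.5}


def score(features: Features, weights: Features = WEIGHTS) -> float:
    """Weighted sum of the features of the position a play leads to."""
    return sum(weights[name] * value for name, value in features.items())


def choose_play(candidates: Dict[str, Features],
                weights: Features = WEIGHTS) -> str:
    """Return the legal play whose resulting position has the greatest weight."""
    return max(candidates, key=lambda play: score(candidates[play], weights))


# Hypothetical legal plays, each described by the features of the
# position that would result from making it.
candidates = {
    "advance": {"material": 0.0, "command": 1.0, "mobility": 2.0},
    "capture": {"material": 1.0, "command": 0.0, "mobility": -1.0},
    "retreat": {"material": 0.0, "command": -1.0, "mobility": 1.0},
}

best = choose_play(candidates)
```

Because the weights never change, the same position always yields the same choice, which is why such a machine seems rigid to its opponent.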
下一步是让机器不仅考虑单场比赛中发生的动作,还要考虑之前玩过的比赛的记录。在此基础上,机器可能会时不时地停下来,不是为了玩,而是为了考虑它所考虑的因素的什么(线性或非线性)权重最适合获胜的比赛,而不是输掉(或平局)的比赛。在此基础上,继续发挥新的权重。在人类对手看来,这样的机器的游戏个性远没有那么严格,早期能够击败它的技巧现在可能无法欺骗它。
The next step is for the machine to take into consideration not merely the moves as they occurred in the individual game but the record of games previously played. On this basis, the machine may stop from time to time, not to play but to consider what (linear or nonlinear) weighting of the factors which it has been given to consider would correspond best to won games as opposed to lost (or drawn) games. On this basis, it continues to play with a new weighting. Such a machine would seem to its human opponent to have a far less rigid game personality, and tricks which would defeat it at an earlier stage may now fail to deceive it.
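The learning step can be sketched in the same spirit. The text does not say how the new weighting is found, so the update rule below (nudging each weight in proportion to how its feature correlated with the outcome) is an assumption chosen for brevity, not the method used by the checker-playing programs. The record, the feature names, and the `rate` parameter are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]
# One entry per past game: summary features and the outcome
# (+1 for a win, -1 for a loss, 0 for a draw).
Record = List[Tuple[Features, int]]


def reweight(weights: Features, record: Record, rate: float = 0.5) -> Features:
    """Return new weights adjusted toward features seen in won games
    and away from features seen in lost games (assumed linear rule)."""
    new_weights = dict(weights)  # leave the caller's weights untouched
    for features, outcome in record:
        for name, value in features.items():
            new_weights[name] += rate * outcome * value
    return new_weights


# Hypothetical record of three previously played games.
record: Record = [
    ({"material": 1.0, "mobility": 2.0}, +1),
    ({"material": 2.0, "mobility": -1.0}, -1),
    ({"material": 0.0, "mobility": 1.0}, +1),
]

start = {"material": 1.0, "mobility": 1.0}
learned = reweight(start, record)
```

After this pause "not to play but to consider," the machine would resume play with `learned` in place of `start`, so a trick that exploited the old weighting may no longer work.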
这些学习机器目前的水平是:它们在国际象棋中可以下出相当不错的业余水平,而在跳棋中,经过 10 到 20 个小时的对弈训练和灌输之后,它们可以显示出明显优于为其编程的棋手的实力。因此,它们确实摆脱了制造者完全有效的控制。尽管它们能够考虑的因素种类可能是固定的,但毫无疑问,它们确实表现出了独创性(与它们对弈过的人也是这么说的),不仅表现在可能完全出乎意料的战术上,甚至表现在其战略的详细权重上。
The present level of these learning machines is that they play a fair amateur game at chess but that in checkers they can show a marked superiority to the player who has programmed them after from 10 to 20 playing hours of working and indoctrination. They thus most definitely escape from the completely effective control of the man who has made them. Rigid as the repertory of factors may be which they are in a position to take into consideration, they do unquestionably—and so say those who have played with them—show originality, not merely in their tactics, which may be quite unforeseen, but even in the detailed weighting of their strategy.
正如我所说,具有学习能力的跳棋机器已经发展到可以击败程序员的地步。然而,他们似乎仍然有一个弱点。这取决于最终的游戏。在这里,机器在确定给予致命一击的最佳方式方面有些笨拙。这是因为现有的机器大部分都采用了在游戏的每个阶段执行相同策略的程序。鉴于棋子值与跳棋的相似性,这对于游戏的大部分来说是很自然的,但当棋盘相对空且主要问题是移动到位置而不是直接攻击时,就不再完全相关了。在我所描述的方法框架内,很有可能进行第二次探索,以确定在对手的棋子数量减少到这些新的考虑因素变得至关重要之后应该采取什么策略。
As I have said, checker-playing machines which learn have developed to the point at which they can defeat the programmer. However, they appear still to have one weakness. This lies in the end game. Here the machines are somewhat clumsy in determining the best way to give the coup de grâce. This is due to the fact that the existing machines have for the most part adopted a program in which the identical strategy is carried out at each stage of the game. In view of the similarity of values of pieces in checkers this is quite natural for a large part of the play but ceases to be perfectly relevant when the board is relatively empty and the main problem is that of moving into position rather than that of direct attack. Within the frame of the methods I have described it is quite possible to have a second exploration to determine what the policy should be after the number of pieces of the opponent is so reduced that these new considerations become paramount.
到目前为止,国际象棋机器还没有达到跳棋机器的完善程度,尽管正如我所说,它们肯定可以下出像样的业余水平。其原因可能与它们在跳棋残局中相对效率的原因类似。在国际象棋中,不仅残局的正确策略与中局有很大不同,开局也是如此。在这方面,跳棋和国际象棋的区别在于,跳棋中棋子最初的走法在性质上与中局出现的走法没有太大区别,而在国际象棋中,棋子在开始时的排列机动性极低,因此从这一位置展开棋子的问题特别困难。这就是开局和出子成为国际象棋理论一个专门分支的原因。
Chess-playing machines have not, so far, been brought to the degree of perfection of checker-playing machines, although, as I have said, they can most certainly play a respectable amateur game. Probably the reason for this is similar to the reason for their relative efficiency in the end game of checkers. In chess, not only is the end game quite different in its proper strategy from the mid-game but the opening game is also. The difference between checkers and chess in this respect is that the initial play of the pieces in checkers is not very different in character from the play which arises in the mid-game, while in chess, pieces at the beginning have an arrangement of exceptionally low mobility, so that the problem of deploying them from this position is particularly difficult. This is the reason why opening play and development form a special branch of chess theory.
机器可以通过多种方式考虑到这些众所周知的事实,并为开局探索一种单独的等待策略。这并不意味着我在这里讨论的这类博弈论不适用于国际象棋,而只是意味着在我们造出能够下出大师级国际象棋的机器之前,还需要更多的考虑。我的一些从事这些问题研究的朋友认为,这个目标将在 10 到 25 年内实现。我不是国际象棋专家,因此不敢自行做出任何此类预测。
There are various ways in which the machine can take cognizance of these well-known facts and explore a separate waiting strategy for the opening. This does not mean that the type of game theory which I have here discussed is not applicable to chess but merely that it requires much more consideration before we can make a machine that can play master chess. Some of my friends who are engaged in these problems believe that this goal will be achieved in from 10 to 25 years. Not being a chess expert, I do not venture to make any such predictions on my own initiative.
在新的按钮战争中,学习机器很可能会被用来对按钮的按下进行编程。在这里,我们正在考虑一个非学习角色的自动机可能已经在使用的领域。根据真实战争的实际经验来对这些机器进行编程是完全不可能的。一方面,如果有足够的经验来进行适当的编程,那么人类可能已经被消灭了。
It is quite in the cards that learning machines will be used to program the pushing of the button in a new push-button war. Here we are considering a field in which automata of a non-learning character are probably already in use. It is quite out of the question to program these machines on the basis of an actual experience in real war. For one thing a sufficient experience to give an adequate programming would probably see humanity already wiped out.
此外,按钮式战争的技术必然会发生很大的变化,以至于等到积累了足够的经验时,最初的基础已经发生了根本性的变化。因此,这种学习机器的编程必须基于某种战争游戏,正如指挥官和参谋人员现在以类似的方式学习战略艺术的重要部分一样。然而,如果战争游戏中的胜利规则与我们对国家的实际期望不相符,那么这样的机器很可能会制定出一种政策:以牺牲我们心中珍视的每一项利益,甚至国家的生存为代价,赢得按点数计算的名义胜利。
Moreover, the techniques of push-button war are bound to change so much that by the time an adequate experience could have been accumulated, the basis of the beginning would have radically changed. Therefore, the programming of such a learning machine would have to be based on some sort of war game, just as commanders and staff officials now learn an important part of the art of strategy in a similar manner. Here, however, if the rules for victory in a war game do not correspond to what we actually wish for our country, it is more than likely that such a machine may produce a policy which would win a nominal victory on points at the cost of every interest we have at heart, even that of national survival.
我们在这里面临的这个问题,这是一个道德问题,与奴隶制的重大问题之一非常接近。让我们承认奴隶制是不好的,因为它是残酷的。然而,它是自相矛盾的,而且其原因也完全不同。我们希望奴隶聪明,能够协助我们执行任务。不过,我们也希望他能够顺从。完全的服从和完全的智慧是不能并存的。在古代,有多少次聪明的希腊哲学家的奴隶,是一个不太聪明的罗马奴隶主的奴隶,一定会支配他主人的行动,而不是服从他的意愿!同样,如果机器变得越来越高效,并且在越来越高的心理层面上运行,那么巴特勒所预见的机器统治的灾难就会越来越近。
The problem, and it is a moral problem, with which we are here faced is very close to one of the great problems of slavery. Let us grant that slavery is bad because it is cruel. It is, however, self-contradictory, and for a reason which is quite different. We wish a slave to be intelligent, to be able to assist us in the carrying out of our tasks. However, we also wish him to be subservient. Complete subservience and complete intelligence do not go together. How often in ancient times the clever Greek philosopher slave of a less intelligent Roman slaveholder must have dominated the actions of his master rather than obeyed his wishes! Similarly, if the machines become more and more efficient and operate at a higher and higher psychological level, the catastrophe foreseen by Butler of the dominance of the machine comes nearer and nearer.
当我们进入更高的逻辑领域时,人脑是比智能机器更有效的控制装置。它是一个自组织系统,依赖于将自身改造为新机器的能力,而不是依赖于解决问题的绝对准确性和速度。我们已经通过严格的政策制造了非常成功的最低逻辑类型的机器。我们开始制造第二种逻辑类型的机器,其中策略本身随着学习而改进。在操作机器的构造中,对于逻辑类型没有具体的可预见的限制,也不能安全地宣布大脑优于机器的确切水平。然而,至少在很长一段时间内,大脑总是会在某个水平上优于人造机器,尽管这个水平可能会不断向上移动。
The human brain is a far more efficient control apparatus than is the intelligent machine when we come to the higher areas of logic. It is a self-organizing system which depends on its capacity to modify itself into a new machine rather than on ironclad accuracy and speed in problem-solving. We have already made very successful machines of the lowest logical type, with a rigid policy. We are beginning to make machines of the second logical type, where the policy itself improves with learning. In the construction of operative machines, there is no specific foreseeable limit with respect to logical type, nor is it safe to make a pronouncement about the exact level at which the brain is superior to the machine. Yet for a long time at least there will always be some level at which the brain is better than the constructed machine, even though this level may shift upwards and upwards.
可以看出,自动化编程技术的结果,是使设计者和操作员无法有效理解机器得出结论所经历的许多阶段,以及机器许多操作的真正战术意图。这与下面的问题高度相关:我们能否在机器仍在运行、我们的干预仍可能阻止后果发生的时候,预见到游戏策略框架之外的不良后果。
It may be seen that the result of a programming technique of automatization is to remove from the mind of the designer and operator an effective understanding of many of the stages by which the machine comes to its conclusions and of what the real tactical intentions of many of its operations may be. This is highly relevant to the problem of our being able to foresee undesired consequences outside the frame of the strategy of the game while the machine is still in action and while intervention on our part may prevent the occurrence of these consequences.
这里有必要认识到人类的行为是一种反馈行为。为了避免灾难性的后果,我们的某些行动足以改变机器的进程是不够的,因为我们很可能缺乏考虑这种行动的信息。
Here it is necessary to realize that human action is a feedback action. To avoid a disastrous consequence, it is not enough that some action on our part should be sufficient to change the course of the machine, because it is quite possible that we lack information on which to base consideration of such an action.
用神经生理学的语言来说,共济失调与瘫痪一样是一种剥夺。患有运动性共济失调的患者可能没有任何肌肉或运动神经缺陷,但如果他的肌肉、肌腱和器官不能准确地告诉他他处于什么位置,以及他的器官所承受的张力是否会或将会不导致他跌倒,他就无法站起来。同样,当我们建造的机器能够以我们无法跟上的速度对其输入数据进行操作时,我们可能不知道何时将其关闭,直到为时已晚。我们都知道魔法师学徒的寓言,在这个故事中,男孩在师傅不在的时候让扫帚挑水,以至于当师傅再次出现时,他差点被淹死。如果这个男孩必须在他主人的图书馆的魔法书中寻找咒语来阻止恶作剧,他可能在发现相关咒语之前就被淹死了。同样,如果一家瓶子工厂是在最大生产力的基础上进行规划的,那么在老板得知自己应该提前六个月停止生产之前,他可能会因为所生产的滞销瓶子的大量库存而破产。
In neurophysiological language, ataxia can be quite as much of a deprivation as paralysis. A patient with locomotor ataxia may not suffer from any defect of his muscles or motor nerves, but if his muscles and tendons and organs do not tell him exactly what position he is in, and whether the tensions to which his organs are subjected will or will not lead to his falling, he will be unable to stand up. Similarly, when a machine constructed by us is capable of operating on its incoming data at a pace which we cannot keep, we may not know, until too late, when to turn it off. We all know the fable of the sorcerer’s apprentice, in which the boy makes the broom carry water in his master’s absence, so that it is on the point of drowning him when his master reappears. If the boy had had to seek a charm to stop the mischief in the grimoires of his master’s library, he might have been drowned before he had discovered the relevant incantation. Similarly, if a bottle factory is programmed on the basis of maximum productivity, the owner may be made bankrupt by the enormous inventory of unsalable bottles manufactured before he learns he should have stopped production six months earlier.
《魔法师的学徒》只是许多基于“魔法力量只会按字面意思行事”这一假设的故事之一。《天方夜谭》里有精灵与渔夫的故事,渔夫打破了所罗门囚禁精灵的封印,却发现精灵早已发誓要将他毁灭。W. W. 雅各布斯(W. W. Jacobs)有一个关于“猴爪”的故事,其中军士长从印度带回了一个护身符,它可以让三个人每人实现三个愿望。关于这个护身符的第一个持有者,我们只被告知他的第三个愿望是求死。军士长是第二个愿望得到满足的人,他觉得自己的经历太可怕了,无法讲述。他的朋友得到了护身符,首先许愿得到 200 英镑。不久之后,他儿子工作的工厂的一名职员来告诉他,他的儿子在机器中丧生,公司在不承认任何责任的情况下,给他送来 200 英镑作为慰问。他的下一个愿望是让儿子回来,于是鬼魂来敲门了。他的第三个愿望是让鬼魂离开。
The “Sorcerer’s Apprentice” is only one of many tales based on the assumption that the agencies of magic are literal-minded. There is the story of the genie and the fisherman in the Arabian Nights, in which the fisherman breaks the seal of Solomon which has imprisoned the genie and finds the genie vowed to his own destruction; there is the tale of the “Monkey’s Paw,” by W. W. Jacobs, in which the sergeant major brings back from India a talisman which has the power to grant each of three people three wishes. Of the first recipient of this talisman we are told only that his third wish is for death. The sergeant major, the second person whose wishes are granted, finds his experiences too terrible to relate. His friend, who receives the talisman, wishes first for £200. Shortly thereafter, an official of the factory in which his son works comes to tell him that his son has been killed in the machinery and that, without any admission of responsibility, the company is sending him as consolation the sum of £200. His next wish is that his son should come back, and the ghost knocks at the door. His third wish is that the ghost should go away.
不仅在童话世界中,而且在现实世界中,只要两个本质上互不相干的机构结合起来试图实现一个共同的目标,就可以预料到灾难性的结果。如果这两个机构之间关于这一目标性质的沟通不完整,那么只能预期这种合作的结果不会令人满意。如果我们为了实现自己的目的而使用一个机械机构,而一旦启动它我们就无法有效地干预它的运行,因为其动作如此迅速且不可撤销,以至于我们在动作完成之前没有可供干预的数据,那么我们最好十分确定,输入机器的目的就是我们真正想要的目的,而不仅仅是对它的一种徒有其表的模仿。
Disastrous results are to be expected not merely in the world of fairy tales but in the real world wherever two agencies essentially foreign to each other are coupled in the attempt to achieve a common purpose. If the communication between these two agencies as to the nature of this purpose is incomplete, it must only be expected that the results of this cooperation will be unsatisfactory. If we use, to achieve our purposes, a mechanical agency with whose operation we cannot efficiently interfere once we have started it, because the action is so fast and irrevocable that we have not the data to intervene before the action is complete, then we had better be quite sure that the purpose put into the machine is the purpose which we really desire and not merely a colorful imitation of it.
到目前为止,我一直在考虑机器和人类在共同事业中同时行动所引起的准道德问题。我们已经看到,使用学习机器时出现灾难性后果的危险,其主要原因之一是人和机器在两个不同的时间尺度上运行,机器比人快得多,两者难以顺利配合。只要时间尺度截然不同的两个控制操作者一起行动,无论哪个系统更快、哪个系统更慢,都会出现同类问题。这给我们留下了更直接的道德问题:当人作为个体与一个时间尺度慢得多的受控过程(例如政治历史的一部分,或者我们主要探讨的主题,即科学的发展)联系在一起行动时,道德问题是什么?
Up to this point I have been considering the quasi-moral problems caused by the simultaneous action of the machine and the human being in a joint enterprise. We have seen that one of the chief causes of the danger of disastrous consequences in the use of the learning machine is that man and machine operate on two distinct time scales, so that the machine is much faster than man and the two do not gear together without serious difficulties. Problems of the same sort arise whenever two control operators on very different time scales act together, irrespective of which system is the faster and which system is the slower. This leaves us the much more directly moral question: What are the moral problems when man as an individual operates in connection with the controlled process of a much slower time scale, such as a portion of political history or—our main subject of inquiry—the development of science?
需要注意的是,科学的发展是对物质的长期认识和控制的控制和交流过程。在这个过程中,50年对于一个人的生命来说就像一天。出于这个原因,个体科学家必须作为一个过程的一部分来工作,这个过程的时间尺度是如此之长,以至于他自己只能思考其中非常有限的部分。在这里,双机的两个部分之间的通信也是困难且有限的。即使当个人相信科学有助于实现他心中的人类目标时,他的信念也需要不断地审视和重新评估,而这只是部分可能的。对于个别科学家来说,即使是对人与过程之间这种联系的部分评估,也需要对历史进行富有想象力的前瞻性审视,这是困难的、严格的,而且只能有限地实现。如果我们简单地坚持科学家的信条,即对世界和我们自己的不完整的了解比没有知识要好,我们仍然不能总是证明这样一个天真的假设是正确的:我们越快地运用新的力量向我们开放的行动就越好。我们必须始终充分发挥我们的想象力来研究充分利用我们的新模式可能会引导我们走向何方。
Let it be noted that the development of science is a control and communication process for the long-term understanding and control of matter. In this process 50 years are as a day in the life of the individual. For this reason, the individual scientist must work as a part of a process whose time scale is so long that he himself can only contemplate a very limited sector of it. Here, too, communication between the two parts of a double machine is difficult and limited. Even when the individual believes that science contributes to the human ends which he has at heart, his belief needs a continual scanning and re-evaluation which is only partly possible. For the individual scientist, even the partial appraisal of this liaison between the man and the process requires an imaginative forward glance at history which is difficult, exacting, and only limitedly achievable. And if we adhere simply to the creed of the scientist, that an incomplete knowledge of the world and of ourselves is better than no knowledge, we can still by no means always justify the naïve assumption that the faster we rush ahead to employ the new powers for action which are opened up to us, the better it will be. We must always exert the full strength of our imagination to examine where the full use of our new modalities may lead us.
经美国科学促进会许可,转载自 Wiener (1960)。
Reprinted from Wiener (1960), with permission from the American Association for the Advancement of Science.
JCR Licklider(1915-1990)与本书中的大多数人物不同。他既不是工程师,也不是数学家,也不是哲学家。然而,通过 20 世纪中叶的几个巧合,他的愿景帮助创造了我们现在生活的世界。
J. C. R. Licklider (1915–1990) is different from most of the characters in this book. He was neither an engineer, nor a mathematician, nor a philosopher. Yet through a happy coincidence of several mid-20th-century circumstances, his vision helped create the world in which we now live.
利克莱德总是只用他名字的首字母来称呼,因为他的昵称就是“利克”。他接受过心理学家培训,在听觉感知方面做了重要的工作,但在冷战期间在麻省理工学院从事防空系统设计工作时,他开始思考我们现在所说的计算中的“人为因素”。电子计算机的人类操作员的任务是在面对来袭导弹和其他军事情报的数据时做出具有重大全球影响的瞬间决策。计算机本身如何帮助人类做出更好的决策?今天的视频游戏、虚拟现实系统和互联网猫视频都是出于对管理核灾难风险的担忧而诞生的。
Licklider is invariably referred to only by the initials of his given names, because his nickname was just “Lick.” He was trained as a psychologist and did important work on auditory perception, but began to think about what we would now call “human factors” in computing while working at MIT during the Cold War on the design of air defense systems. A human operator of an electronic computer was tasked with making split-second decisions of vast global consequence when confronted with data about incoming missiles and other military intelligence. How might the computer itself help the human make better decisions? Today’s video games, virtual reality systems, and internet cat videos were born of such concerns about managing the risk of nuclear cataclysm.
当心理学研究中心搬到哈佛大学后,利克莱德离开麻省理工学院,加入了剑桥的 Bolt Beranek and Newman (BBN) 公司。该公司以其声学专业知识而闻名,但最终获得了为互联网的前身阿帕网 (ARPANET) 建造最早的网关的合同。在那里,利克莱德撰写了此处收录的这篇有影响力的论文。它在一定程度上受益于万尼瓦尔·布什的远见,而且利克莱德肯定在麻省理工学院就认识布什。但到这篇论文写成时,一些计算机已经可以通过编程实现交互式和动态图形。1962 年,利克莱德从 BBN 转到国防部高级研究计划局 (ARPA),在那里他帮助催生了阿帕网。
When the center of psychology research moved to Harvard, Licklider left MIT for the Cambridge firm Bolt Beranek and Newman (BBN), which was famous for its acoustic expertise but would eventually receive contracts to build the earliest gateways for the ARPANET, the forerunner of the internet. While there, Licklider wrote the influential paper included here. It owes something to Vannevar Bush’s vision, and to be sure Licklider knew Bush from MIT. But by the time it was written some computers were capable of being programmed to carry out interactive and dynamic graphics. From BBN, Licklider moved in 1962 to the Advanced Research Projects Agency (ARPA) of the Department of Defense, where he helped give birth to the ARPANET.
利克莱德离开政府并在 IBM 短暂任职,他发现这家公司缺乏远见。1966 年,他回到麻省理工学院,并在那里度过了余下职业生涯的大部分时间,其中有短暂的第二次在国防部任职,负责指导其高级研究项目。他是一个有趣、善于表达、谦逊的人。我是在哈佛大学 DEC PDP-1 项目上读本科时认识他的,当时我帮助他调试了他正在编写的一些程序。我完全没有意识到他的远见使我正在做的工作成为可能——我以为他可能是一名退休的研究生——而且他也没有试图向我提供线索。
Licklider left the government for a short stint at IBM, which he found insufficiently visionary. He returned to MIT in 1966 and remained there for most of the rest of his career, with a short second tour of duty at DoD steering its advanced research projects. He was a funny, articulate, and humble man. I met him when I, working as an undergraduate on Harvard’s DEC PDP-1, helped him debug some program he was writing. I was utterly unaware that he was the man whose vision had made possible the very work I was doing—I thought he might be a superannuated graduate student—and he made no attempt to clue me in.
人机共生是人与电子计算机之间合作交互的一种预期发展。它将涉及伙伴关系中人类成员与电子成员之间非常紧密的耦合。主要目标是:1)让计算机促进形成性思维,正如它们现在促进已表述问题的求解一样;2)使人和计算机能够合作做出决策和控制复杂局面,而不必僵化地依赖预定程序。在预期的共生伙伴关系中,人将设定目标、提出假设、确定标准并进行评估。计算机将完成必须完成的可例行化的工作,为技术和科学思维中的洞见和决策铺平道路。初步分析表明,共生伙伴关系执行智力操作的效率将远高于人单独执行。实现有效的合作关联的先决条件包括计算机分时、存储器部件、存储器组织、编程语言以及输入和输出设备方面的发展。
MAN-COMPUTER symbiosis is an expected development in cooperative interaction between men and electronic computers. It will involve very close coupling between the human and the electronic members of the partnership. The main aims are 1) to let computers facilitate formulative thinking as they now facilitate the solution of formulated problems, and 2) to enable men and computers to cooperate in making decisions and controlling complex situations without inflexible dependence on predetermined programs. In the anticipated symbiotic partnership, men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations. Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking. Preliminary analyses indicate that the symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them. Prerequisites for the achievement of the effective, cooperative association include developments in computer time sharing, in memory components, in memory organization, in programming languages, and in input and output equipment.
“人机共生”是人机系统的一个子类。人机系统有很多。然而,目前还没有人机共生。本文的目的是提出这一概念,并希望通过分析人与计算机之间交互的一些问题、提请人们注意人机工程中适用的原理、指出若干需要研究来回答的问题,来促进人机共生的发展。希望在不太久的将来,人脑和计算机将非常紧密地耦合在一起,由此产生的伙伴关系将以人脑从未有过的方式进行思考,并以我们今天所知的信息处理机器无法企及的方式处理数据。
“Man–computer symbiosis” is a subclass of man–machine systems. There are many man–machine systems. At present, however, there are no man–computer symbioses. The purposes of this paper are to present the concept and, hopefully, to foster the development of man–computer symbiosis by analyzing some problems of interaction between men and computing machines, calling attention to applicable principles of man–machine engineering, and pointing out a few questions to which research answers are needed. The hope is that, in not too many years, human brains and computing machines will be coupled together very tightly, and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today.
当然,从某种意义上说,任何人造系统都是为了帮助人类,帮助系统之外的一个或多个人。然而,如果我们关注系统内的人类操作员,我们就会发现,在某些技术领域,过去几年发生了巨大的变化。“机械延伸”已经让位给人类的替代和自动化,而留下来的人更多的是提供帮助而不是被帮助。在某些情况下,特别是在以计算机为中心的大型信息和控制系统中,人类操作员主要负责那些被证明无法实现自动化的功能。这样的系统(诺斯可能称之为“人类延伸的机器”)不是共生系统。它们是“半自动”系统,一开始是全自动的,但没有达到目标。
In one sense of course, any man-made system is intended to help man, to help a man or men outside the system. If we focus upon the human operator within the system, however, we see that, in some areas of technology, a fantastic change has taken place during the last few years. “Mechanical extension” has given way to replacement of men, to automation, and the men who remain are there more to help than to be helped. In some instances, particularly in large computer-centered information and control systems, the human operators are responsible mainly for functions that it proved infeasible to automate. Such systems (“humanly extended machines,” North might call them) are not symbiotic systems. They are “semi-automatic” systems, systems that started out to be fully automatic but fell short of the goal.
人机共生可能不是复杂技术系统的最终范式。完全有可能的是,在适当的时候,电子或化学“机器”将在我们现在认为专属于人脑的大多数功能上超越人脑。即使是现在,Gelernter 用于证明平面几何定理的 IBM 704 程序,其进度也与布鲁克林高中生大致相同,并且会犯类似的错误(Gelernter,1959)。事实上,已有若干定理证明、问题求解、下棋和模式识别程序,多到无法一一引用(Bernstein 和 Roberts,1958;Bledsoe 和 Browning,1959;Dinneen,1955;Farley 和 Clark,1954;Friedberg,1958;Gilmore 和 Savell,1959;Newell,1955;Newell 和 Shaw,1957;Newell 等,1958;Selfridge,1958;Shannon,1950;Sherman,1959),它们能够在有限的领域内与人类的智力表现相媲美;而纽厄尔、西蒙和肖的“通用问题求解器”可能会消除其中一些限制。简而言之,似乎值得承认在遥远的未来思维活动将由机器独自主导,以避免与(其他)人工智能爱好者争论。尽管如此,仍将有一个相当长的过渡时期,在此期间,主要的智力进步将由人和计算机密切合作来实现。一个研究空军未来研发问题的多学科研究小组估计,要到 1980 年,人工智能的发展才能使机器独自进行具有军事意义的大量思考或问题求解。这样一来,比如说,就有 5 年的时间来开发人机共生,并有 15 年的时间来使用它。这 15 年可能是 10 年,也可能是 500 年,但这些年应该是人类历史上智力上最具创造力、最令人兴奋的年份。
Man–computer symbiosis is probably not the ultimate paradigm for complex technological systems. It seems entirely possible that, in due course, electronic or chemical “machines” will outdo the human brain in most of the functions we now consider exclusively within its province. Even now, Gelernter’s IBM 704 program for proving theorems in plane geometry proceeds at about the same pace as Brooklyn high school students, and makes similar errors (Gelernter, 1959). There are, in fact, several theorem-proving, problem-solving, chess-playing, and pattern-recognizing programs—too many for complete reference (Bernstein and Roberts, 1958; Bledsoe and Browning, 1959; Dinneen, 1955; Farley and Clark, 1954; Friedberg, 1958; Gilmore and Savell, 1959; Newell, 1955; Newell and Shaw, 1957; Newell et al., 1958; Selfridge, 1958; Shannon, 1950; Sherman, 1959)—capable of rivaling human intellectual performance in restricted areas; and Newell, Simon, and Shaw’s “general problem solver” may remove some of the restrictions. In short, it seems worthwhile to avoid argument with (other) enthusiasts for artificial intelligence by conceding dominance in the distant future of cerebration to machines alone. There will nevertheless be a fairly long interim during which the main intellectual advances will be made by men and computers working together in intimate association. A multidisciplinary study group, examining future research and development problems of the Air Force, estimated that it would be 1980 before developments in artificial intelligence make it possible for machines alone to do much thinking or problem solving of military significance. That would leave, say, five years to develop man–computer symbiosis and 15 years to use it. The 15 may be 10 or 500, but those years should be intellectually the most creative and exciting in the history of mankind.
当今的计算机主要设计用于解决预先制定的问题或根据预定程序处理数据。计算过程可能取决于计算期间获得的结果,但必须提前预见所有替代方案。(如果出现不可预见的替代方案,整个过程就会停止并等待计划的必要扩展。)预先制定或预先确定的要求有时并不是很大的缺点。人们常说,计算机编程迫使人们清晰地思考,它规范了思维过程。如果用户能够提前思考他的问题,则不需要与计算机共生关联。
Present-day computers are designed primarily to solve preformulated problems or to process data according to predetermined procedures. The course of the computation may be conditional upon results obtained during the computation, but all the alternatives must be foreseen in advance. (If an unforeseen alternative arises, the whole process comes to a halt and awaits the necessary extension of the program.) The requirement for preformulation or predetermination is sometimes no great disadvantage. It is often said that programming for a computing machine forces one to think clearly, that it disciplines the thought process. If the user can think his problem through in advance, symbiotic association with a computing machine is not necessary.
人机共生是人与电子计算机之间合作交互的预期发展。它将涉及伙伴关系中的人类和电子成员之间非常紧密的耦合。主要目标是:1)让计算机促进公式化思维,因为它们现在可以促进公式化问题的解决;2)使人和计算机能够合作做出决策和控制复杂的情况,而不是僵化地依赖预定程序。在预期的共生伙伴关系中,男人将设定目标、提出假设、确定标准并进行评估。计算机将完成必须完成的常规工作,为技术和科学思维中的见解和决策铺平道路。初步分析表明,共生伙伴关系将比人类单独执行更有效地执行智力操作。实现有效的合作关联的先决条件包括计算机时间共享、存储器组件、存储器组织、编程语言以及输入和输出设备的发展。
Man–computer symbiosis is an expected development in cooperative interaction between men and electronic computers. It will involve very close coupling between the human and the electronic members of the partnership. The main aims are 1) to let computers facilitate formulative thinking as they now facilitate the solution of formulated problems, and 2) to enable men and computers to cooperate in making decisions and controlling complex situations without inflexible dependence on predetermined programs. In the anticipated symbiotic partnership, men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations. Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking. Preliminary analyses indicate that the symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them. Prerequisites for the achievement of the effective, cooperative association include developments in computer time sharing, in memory components, in memory organization, in programming languages, and in input and output equipment.
前面的段落默认了这样的假设:如果能够有效地将它们引入思维过程,那么数据处理机器可以执行的功能将以重要的方式改善或促进思维和问题解决。这一假设可能需要论证。
The preceding paragraphs tacitly made the assumption that, if they could be introduced effectively into the thought process, the functions that can be performed by data-processing machines would improve or facilitate thinking and problem solving in an important way. That assumption may require justification.
很快我就发现,我所做的主要事情就是保存记录,如果按照最初计划中设想的细节来保存记录,那么这个项目就会变成无限倒退。不是。尽管如此,我获得了一张我的活动照片,这让我停了下来。也许我的光谱并不典型——我希望不是,但我担心它是。
It soon became apparent that the main thing I did was to keep records, and the project would have become an infinite regress if the keeping of records had been carried through in the detail envisaged in the initial plan. It was not. Nevertheless, I obtained a picture of my activities that gave me pause. Perhaps my spectrum is not typical—I hope it is not, but I fear it is.
我大约 85% 的“思考”时间都花在了为思考、做决定、学习我需要知道的东西做准备上。花在寻找或获取信息上的时间比花在消化信息上的时间要多得多。几个小时用于绘制图表,另外几个小时则用于指导助手如何绘制。当图表完成后,关系立刻就显而易见了,但为了使它们如此,必须进行绘图。有一次,有必要比较对语音清晰度与语音噪声比之间函数关系的六次实验测定。没有两个实验者使用相同的语噪比定义或测量方法。需要几个小时的计算才能将数据转化为可比较的形式。当它们处于可比较的形式时,只需要几秒钟就可以确定我需要知道什么。
About 85 per cent of my “thinking” time was spent getting into a position to think, to make a decision, to learn something I needed to know. Much more time went into finding or obtaining information than into digesting it. Hours went into the plotting of graphs, and other hours into instructing an assistant how to plot. When the graphs were finished, the relations were obvious at once, but the plotting had to be done in order to make them so. At one point, it was necessary to compare six experimental determinations of a function relating speech-intelligibility to speech-to-noise ratio. No two experimenters had used the same definition or measure of speech-to-noise ratio. Several hours of calculating were required to get the data into comparable form. When they were in comparable form, it took only a few seconds to determine what I needed to know.
简而言之,在我检查的整个期间,我的“思考”时间主要用于本质上是文书或机械的活动:搜索、计算、绘图、转换、确定一组假设或假设的逻辑或动态结果、准备决策或见解的方式。此外,我对尝试什么和不尝试什么的选择在很大程度上取决于文书可行性的考虑,而不是智力能力。
Throughout the period I examined, in short, my “thinking” time was devoted mainly to activities that were essentially clerical or mechanical: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or an insight. Moreover, my choices of what to attempt and what not to attempt were determined to an embarrassingly great extent by considerations of clerical feasibility, not intellectual capability.
刚才描述的研究结果传达的主要建议是,据称大部分时间用于技术思维的操作是可以由机器比人更有效地执行的操作。由于这些操作必须在不同的变量上并以不可预见的和不断变化的顺序执行,这一事实造成了严重的问题。然而,如果这些问题能够通过在人与快速信息检索和数据处理机器之间建立共生关系的方式得到解决,那么合作互动显然将极大地改善思维过程。
The main suggestion conveyed by the findings just described is that the operations that fill most of the time allegedly devoted to technical thinking are operations that can be performed more effectively by machines than by men. Severe problems are posed by the fact that these operations have to be performed upon diverse variables and in unforeseen and continually changing sequences. If those problems can be solved in such a way as to create a symbiotic relation between a man and a fast information-retrieval and data-processing machine, however, it seems evident that the cooperative interaction would greatly improve the thinking process.
在这一点上,我们可能应该承认,我们使用“计算机”一词来涵盖一类广泛的计算、数据处理以及信息存储和检索机器。此类机器的功能几乎每天都在增加。因此,对班级的能力做出一般性的陈述是危险的。也许对人的能力做出一般性的陈述同样危险。尽管如此,人与计算机之间在能力上的某些基因型差异确实很突出,并且它们对可能的人机共生的性质以及实现它的潜在价值产生影响。
It may be appropriate to acknowledge, at this point, that we are using the term “computer” to cover a wide class of calculating, data-processing, and information-storage-and-retrieval machines. The capabilities of machines in this class are increasing almost daily. It is therefore hazardous to make general statements about capabilities of the class. Perhaps it is equally hazardous to make general statements about the capabilities of men. Nevertheless, certain genotypic differences in capability between men and computers do stand out, and they have a bearing on the nature of possible man–computer symbiosis and the potential value of achieving it.
正如人们以各种方式所说的那样,人类是嘈杂的窄带设备,但他们的神经系统有很多并行且同时活跃的通道。相对于人类,计算机非常快且非常准确,但它们一次只能执行一个或几个基本操作。男人很灵活,能够根据新收到的信息“随机地对自己进行编程”。计算机是专一的,受到“预编程”的限制。男人自然会说围绕单一物体和连贯动作组织的冗余语言,并使用 20 到 60 个基本符号。计算机“自然地”讲非冗余语言,通常只有两个基本符号,并且对单一对象或连贯动作没有固有的欣赏能力。
As has been said in various ways, men are noisy, narrow-band devices, but their nervous systems have very many parallel and simultaneously active channels. Relative to men, computing machines are very fast and very accurate, but they are constrained to perform only one or a few elementary operations at a time. Men are flexible, capable of “programming themselves contingently” on the basis of newly received information. Computing machines are single-minded, constrained by their “pre-programming.” Men naturally speak redundant languages organized around unitary objects and coherent actions and employing 20 to 60 elementary symbols. Computers “naturally” speak nonredundant languages, usually with only two elementary symbols and no inherent appreciation either of unitary objects or of coherent actions.
为了严格正确,这些特征必须包括许多限定词。尽管如此,它们所呈现的差异图景(以及因此潜在的补充)本质上是有效的。计算机可以轻松、良好、快速地完成许多对人类来说困难或不可能的事情,而人类也可以轻松、良好(尽管不是很快)地完成许多对计算机来说困难或不可能的事情。这表明,如果成功地整合人与计算机的积极特征,共生合作将具有巨大的价值。当然,速度和语言的差异带来了必须克服的困难。
To be rigorously correct, those characterizations would have to include many qualifiers. Nevertheless, the picture of dissimilarity (and therefore potential supplementation) that they present is essentially valid. Computing machines can do readily, well, and rapidly many things that are difficult or impossible for man, and men can do readily and well, though not rapidly, many things that are difficult or impossible for computers. That suggests that a symbiotic cooperation, if successful in integrating the positive characteristics of men and computers, would be of great value. The differences in speed and in language, of course, pose difficulties that must be overcome.
在许多操作中,人类操作员和设备的贡献似乎会如此完全地融合在一起,以至于很难在分析中将它们整齐地分开。例如,如果在收集决策依据的数据时,人和计算机都从经验中得出了相关先例,并且计算机随后提出了与人的直觉判断一致的行动方案,那么情况就会如此。(在定理证明程序中,计算机在经验中寻找先例,在 SAGE 系统中,它们建议行动方案。上述并不是一个牵强的例子。)然而,在其他操作中,人员和设备的贡献将是某种程度上是可以分开的。
It seems likely that the contributions of human operators and equipment will blend together so completely in many operations that it will be difficult to separate them neatly in analysis. That would be the case if, in gathering data on which to base a decision, for example, both the man and the computer came up with relevant precedents from experience and if the computer then suggested a course of action that agreed with the man’s intuitive judgment. (In theorem-proving programs, computers find precedents in experience, and in the SAGE System, they suggest courses of action. The foregoing is not a far-fetched example.) In other operations, however, the contributions of men and equipment will be to some extent separable.
当然,至少在早期,男性会设定目标并提供动力。他们将提出假设。他们会问问题。他们会思考机制、程序和模型。他们会记得某某人在 1947 年,或者至少是在第二次世界大战后不久,就某个感兴趣的主题做了一些可能相关的工作,并且他们会知道这些工作可能发表在哪些期刊上。一般来说,他们会做出近似的、可能错误的但具有主导性的贡献,他们将定义标准并充当评估者,判断设备的贡献并指导总体思路。
Men will set the goals and supply the motivations, of course, at least in the early years. They will formulate hypotheses. They will ask questions. They will think of mechanisms, procedures, and models. They will remember that such-and-such a person did some possibly relevant work on a topic of interest back in 1947, or at any rate shortly after World War II, and they will have an idea in what journals it might have been published. In general, they will make approximate and fallible, but leading, contributions, and they will define criteria and serve as evaluators, judging the contributions of the equipment and guiding the general line of thought.
此外,当极低概率的情况确实出现时,将由人来处理。(在当前的人机系统中,这是人类操作员最重要的功能之一。极低概率替代方案的概率之和往往太大而不容忽视。)当计算机没有适用于特定情况的模式或例程时,人类将填补问题解决方案或计算机程序中的空白。
In addition, men will handle the very-low-probability situations when such situations do actually arise. (In current man–machine systems, that is one of the human operator’s most important functions. The sum of the probabilities of very-low-probability alternatives is often much too large to neglect.) Men will fill in the gaps, either in the problem solution or in the computer program, when the computer has no mode or routine that is applicable in a particular circumstance.
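The parenthetical remark about summed probabilities can be made concrete with a small worked figure. The numbers below are purely illustrative assumptions, not values from the text: suppose a system faces n = 1000 mutually exclusive rare alternatives, each with probability p = 10^{-4}.

```latex
P(\text{some rare alternative arises})
  = \sum_{i=1}^{n} p_i
  = n\,p
  = 1000 \times 10^{-4}
  = 0.1
```

Under these assumed numbers, each alternative alone could safely be ignored, yet one of them arises in one case out of ten, which is why a human operator is kept in the loop to handle them.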
就信息处理设备而言,它将把假设转换成可测试的模型,然后根据数据测试模型(人类操作员可以粗略地指定这些数据,并在计算机将其提供给他批准时识别为相关)。设备将回答问题。它将模拟机制和模型,执行程序,并将结果显示给操作员。它将转换数据,绘制图表(以人类操作员指定的任何方式“切蛋糕”,或者如果人类操作员不确定他想要什么,则以几种替代方式)。该设备将进行插值、推断和变换。它将静态方程或逻辑语句转换为动态模型,以便操作员可以检查他们的行为。一般来说,它将执行可例行公事的文书操作,以填补决策之间的间隔。
The information-processing equipment, for its part, will convert hypotheses into testable models and then test the models against data (which the human operator may designate roughly and identify as relevant when the computer presents them for his approval). The equipment will answer questions. It will simulate the mechanisms and models, carry out the procedures, and display the results to the operator. It will transform data, plot graphs (“cutting the cake” in whatever way the human operator specifies, or in several alternative ways if the human operator is not sure what he wants). The equipment will interpolate, extrapolate, and transform. It will convert static equations or logical statements into dynamic models so the human operator can examine their behavior. In general, it will carry out the routinizable, clerical operations that fill the intervals between decisions.
此外,只要有足够的基础来支持正式的统计分析,计算机将充当统计推断、决策理论或博弈论机器,对建议的行动方案进行基本评估。最后,它将尽可能多地进行诊断、模式匹配和相关性识别,但它在这些领域将接受明显的次要地位。
In addition, the computer will serve as a statistical-inference, decision-theory, or game-theory machine to make elementary evaluations of suggested courses of action whenever there is enough basis to support a formal statistical analysis. Finally, it will do as much diagnosis, pattern-matching, and relevance-recognizing as it profitably can, but it will accept a clearly secondary status in those areas.
上一节默认的数据处理设备不可用。计算机程序尚未编写。事实上,非共生的现在和预期的共生的未来之间存在着几个障碍。让我们研究其中的一些,以更清楚地了解需要什么以及实现它的机会有多大。
The data-processing equipment tacitly postulated in the preceding section is not available. The computer programs have not been written. There are in fact several hurdles that stand between the nonsymbiotic present and the anticipated symbiotic future. Let us examine some of them to see more clearly what is needed and what the chances are of achieving it.
似乎可以合理地设想,在 10 或 15 年后,一个“思维中心”将把当今图书馆的功能与信息存储和检索方面的预期进步以及本文前面提出的共生功能结合起来。这幅图景很容易扩大为这样的中心网络,这些中心通过宽带通信线路相互连接,并通过租用线路服务与个人用户连接。在这样的系统中,计算机的速度将得到平衡,巨大的存储器和复杂程序的成本将除以用户数量。
It seems reasonable to envision, for a time 10 or 15 years hence, a “thinking center” that will incorporate the functions of present-day libraries together with anticipated advances in information storage and retrieval and the symbiotic functions suggested earlier in this paper. The picture readily enlarges itself into a network of such centers, connected to one another by wide-band communication lines and to individual users by leased-wire services. In such a system, the speed of the computers would be balanced, and the cost of the gigantic memories and the sophisticated programs would be divided by the number of users.
首先要面对的是,我们不应该将所有技术和科学论文都存储在计算机内存中。我们可以存储可以最简洁概括的部分——定量部分和参考引文——但不是全部。书籍是现有设计最精美、最人性化的组件之一,在人机共生的背景下,它们将继续发挥重要的功能。(希望计算机能够加快书籍的查找、交付和归还速度。)
The first thing to face is that we shall not store all the technical and scientific papers in computer memory. We may store the parts that can be summarized most succinctly—the quantitative parts and the reference citations—but not the whole. Books are among the most beautifully engineered, and human-engineered, components in existence, and they will continue to be functionally important within the context of man–computer symbiosis. (Hopefully, the computer will expedite the finding, delivering, and returning of books.)
第二点是,记忆的一个非常重要的部分将是永久的:部分不可磨灭的记忆和部分公开的记忆。计算机将能够向不可擦除存储器写入一次,然后无限期地读回,但计算机将无法擦除不可擦除存储器。(它也可能会覆盖,将所有 0 变成 1,就像标记之前写入的内容一样。) 发布的内存将是“只读”内存。它将被引入到已经结构化的计算机中。计算机将能够重复引用它,但不能更改它。随着计算机变得越来越大,这些类型的内存将变得越来越重要。它们可以做得比核心、薄膜甚至磁带存储器更紧凑,而且价格也便宜得多。主要的工程问题将涉及选择电路。
The second point is that a very important section of memory will be permanent: part indelible memory and part published memory. The computer will be able to write once into indelible memory, and then read back indefinitely, but the computer will not be able to erase indelible memory. (It may also over-write, turning all the 0s into 1s, as though marking over what was written earlier.) Published memory will be “read-only” memory. It will be introduced into the computer already structured. The computer will be able to refer to it repeatedly, but not to change it. These types of memory will become more and more important as computers grow larger. They can be made more compact than core, thin-film, or even tape memory, and they will be much less expensive. The main engineering problems will concern selection circuitry.
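The write-once behavior described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class name `IndelibleMemory` and its methods are hypothetical, and a list of bits stands in for whatever physical medium would be used. A bit may go from 0 to 1 but never back, which also covers the "marking over" case of turning all the 0s into 1s.

```python
class IndelibleMemory:
    """Toy model of indelible memory: bits can be set but never cleared.

    Hypothetical illustration; names and interface are assumptions.
    """

    def __init__(self, size):
        self.bits = [0] * size  # a fresh memory holds all 0s

    def write(self, address, bit):
        if bit not in (0, 1):
            raise ValueError("bit must be 0 or 1")
        if bit == 0 and self.bits[address] == 1:
            # Erasing (1 -> 0) is exactly what indelible memory forbids.
            raise ValueError("indelible memory cannot be erased")
        self.bits[address] |= bit

    def read(self, address):
        # Reading back is allowed indefinitely.
        return self.bits[address]

    def mark_over(self, start, length):
        # Over-writing: turn every bit in the span into 1,
        # "as though marking over what was written earlier".
        for i in range(start, start + length):
            self.bits[i] = 1
```

Published ("read-only") memory would differ only in that the computer never writes it at all; in this sketch it could be modeled as an immutable tuple supplied when the machine is built.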
就内存需求的其他方面而言,我们可以指望普通科学和商业计算机器的持续发展。内存元件有可能变得与处理(逻辑)元件一样快。这一发展将对计算机的设计产生革命性的影响。
In so far as other aspects of memory requirement are concerned, we may count upon the continuing development of ordinary scientific and business computing machines. There is some prospect that memory elements will become as fast as processing (logic) elements. That development would have a revolutionary effect upon the design of computers.
Trie 内存由其创始人 Fredkin (1960) 如此称呼,因为它的设计目的是为了方便信息检索,而且分支存储结构开发后类似于树。最常见的内存系统将参数函数存储在参数指定的位置。(在某种意义上,它们根本不存储参数。在另一种更现实的意义上,它们将所有可能的参数存储在内存的框架结构中。)另一方面,trie 内存系统存储这两个函数和论点。参数首先被引入内存,一次一个字符,从标准初始寄存器开始。每个参数寄存器对于整体的每个字符都有一个单元(例如,对于以二进制形式编码的信息有两个单元),并且每个字符单元在其内部具有用于下一个寄存器的地址的存储空间。参数通过写入一系列地址来存储,每个地址都告诉在哪里找到下一个地址。参数的末尾有一个特殊的“参数结束”标记。然后按照函数的指示,该函数以多种方式存储,进一步的特里结构或“列表结构”通常是最有效的。
Trie memory is so called by its originator, Fredkin (1960), because it is designed to facilitate retrieval of information and because the branching storage structure, when developed, resembles a tree. Most common memory systems store functions of arguments at locations designated by the arguments. (In one sense, they do not store the arguments at all. In another and more realistic sense, they store all the possible arguments in the framework structure of the memory.) The trie memory system, on the other hand, stores both the functions and the arguments. The argument is introduced into the memory first, one character at a time, starting at a standard initial register. Each argument register has one cell for each character of the ensemble (e.g., two for information encoded in binary form) and each character cell has within it storage space for the address of the next register. The argument is stored by writing a series of addresses, each one of which tells where to find the next. At the end of the argument is a special “end-of-argument” marker. Then follow directions to the function, which is stored in one or another of several ways, either further trie structure or “list structure” often being most effective.
trie 内存方案对于小内存来说效率较低,但随着内存大小的增加,它在使用可用存储空间方面变得越来越高效。该方案的吸引人的特点是: 1)检索过程极其简单。给定参数,输入第一个字符的标准初始寄存器,并获取第二个字符的地址。然后转到第二个寄存器,并获取第三个寄存器的地址,依此类推。 2) 如果两个参数具有共同的初始字符,则它们对这些字符使用相同的存储空间。3) 参数的长度不必相同,也不必预先指定。4) 在实际存储之前,不会为任何参数保留或使用任何存储空间。当项目被引入内存时,就会创建 trie 结构。5) 一个函数可以用作另一个函数的参数,并且该函数可以用作下一个函数的参数。因此,例如,通过输入参数“矩阵乘法”,人们可以检索用于在计算机上执行矩阵乘法的整个程序。6) 通过检查给定级别的存储,可以确定迄今为止存储了哪些类似的项目。例如,如果没有对 Egan, JP 的引用,那么只需向后退一两步即可找到 Egan, James 的踪迹……。
The trie memory scheme is inefficient for small memories, but it becomes increasingly efficient in using available storage space as memory size increases. The attractive features of the scheme are these: 1) The retrieval process is extremely simple. Given the argument, enter the standard initial register with the first character, and pick up the address of the second. Then go to the second register, and pick up the address of the third, etc. 2) If two arguments have initial characters in common, they use the same storage space for those characters. 3) The lengths of the arguments need not be the same, and need not be specified in advance. 4) No room in storage is reserved for or used by any argument until it is actually stored. The trie structure is created as the items are introduced into the memory. 5) A function can be used as an argument for another function, and that function as an argument for the next. Thus, for example, by entering with the argument, “matrix multiplication,” one might retrieve the entire program for performing a matrix multiplication on the computer. 6) By examining the storage at a given level, one can determine what thus-far similar items have been stored. For example, if there is no citation for Egan, J. P., it is but a step or two backward to pick up the trail of Egan, James ….
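The storage and retrieval steps described above can be sketched as follows. This is a minimal modern rendering, not Fredkin's implementation: the class and method names are hypothetical, each "register" is modeled as a Python dictionary with one cell per character actually used, and the index of the next register plays the role of the stored address.

```python
class TrieMemory:
    """Toy trie memory: a list of registers, each mapping a character
    to the address (list index) of the next register.

    Hypothetical illustration; names and layout are assumptions.
    """

    END = object()  # stands in for the "end-of-argument" marker

    def __init__(self):
        self.registers = [{}]  # register 0 is the standard initial register

    def _follow(self, text):
        """Walk the chain of addresses for `text`; return None if it breaks."""
        reg = 0
        for ch in text:
            cell = self.registers[reg]
            if ch not in cell:
                return None
            reg = cell[ch]
        return reg

    def store(self, argument, function):
        reg = 0
        for ch in argument:
            cell = self.registers[reg]
            if ch not in cell:
                # Feature 4: room is allocated only when an item is stored.
                self.registers.append({})
                cell[ch] = len(self.registers) - 1
            reg = cell[ch]
        self.registers[reg][self.END] = function

    def retrieve(self, argument):
        reg = self._follow(argument)
        if reg is None:
            return None
        return self.registers[reg].get(self.END)

    def similar(self, prefix):
        """Feature 6: list stored arguments that begin with `prefix`."""
        start = self._follow(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            reg, text = stack.pop()
            for key, value in self.registers[reg].items():
                if key is self.END:
                    found.append(text)
                else:
                    stack.append((value, text + key))
        return sorted(found)


memory = TrieMemory()
memory.store("Egan, James", "citation A")  # placeholder contents
memory.store("Egan, John", "citation B")
print(memory.retrieve("Egan, J. P."))  # nothing stored under this argument
print(memory.similar("Egan, J"))       # step back and pick up the trail
```

Because the two stored arguments share the initial characters "Egan, J", they share the registers for those characters (feature 2), and arguments of different lengths coexist without any length being declared in advance (feature 3).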
刚才描述的属性并不包括所有所需的属性,但它们使计算机存储与人类操作员以及他们通过命名或指向来指定事物的偏好产生共鸣。
The properties just described do not include all the desired ones, but they bring computer storage into resonance with human operators and their predilection to designate things by naming or pointing.
然而,为了人与计算机之间的实时合作,有必要利用额外的、相当不同的通信和控制原理。通过将通常针对智能人类的指令与通常用于计算机的指令进行比较,可以突出这个想法。后者精确地指定了要采取的各个步骤以及采取这些步骤的顺序。前者呈现或暗示有关激励或动机的内容,并且它们提供了一个标准,指令的人类执行者将通过该标准知道他何时完成了任务。简而言之:针对计算机的指令指定了课程;针对人类的指令明确了目标。
For the purposes of real-time cooperation between men and computers, it will be necessary, however, to make use of an additional and rather different principle of communication and control. The idea may be highlighted by comparing instructions ordinarily addressed to intelligent human beings with instructions ordinarily used with computers. The latter specify precisely the individual steps to take and the sequence in which to take them. The former present or imply something about incentive or motivation, and they supply a criterion by which the human executor of the instructions will know when he has accomplished his task. In short: instructions directed to computers specify courses; instructions directed to human beings specify goals.
男性似乎更自然、更容易地思考目标而不是课程。诚然,他们通常知道一些旅行方向或工作路线,但很少有人一开始就制定精确的行程。例如,谁会从波士顿出发前往洛杉矶并提供详细的路线说明?相反,用维纳的话来说,前往洛杉矶的人们不断地努力减少他们尚未陷入雾霾的时间。
Men appear to think more naturally and easily in terms of goals than in terms of courses. True, they usually know something about directions in which to travel or lines along which to work, but few start out with precisely formulated itineraries. Who, for example, would depart from Boston for Los Angeles with a detailed specification of the route? Instead, to paraphrase Wiener, men bound for Los Angeles try continually to decrease the amount by which they are not yet in the smog.
通过明确目标的计算机指令正在沿着两条路径进行。第一个涉及解决问题、爬山、自组织计划。第二个涉及预编程段和封闭子例程的实时串联,操作人员可以简单地通过名称来指定和调用这些子例程。
Computer instruction through specification of goals is being approached along two paths. The first involves problem-solving, hill-climbing, self-organizing programs. The second involves real-time concatenation of preprogrammed segments and closed subroutines which the human operator can designate and call into action simply by name.
沿着第一条道路,已经出现了有希望的探索性工作。显然,在预定策略的宽松约束下工作,计算机在适当的时候将能够设计和简化自己的程序以实现既定目标。到目前为止,所取得的成就并不具有实质性意义;它们仅构成“原则上的示威”。然而,其影响是深远的。
Along the first of these paths, there has been promising exploratory work. It is clear that, working within the loose constraints of predetermined strategies, computers will in due course be able to devise and simplify their own procedures for achieving stated goals. Thus far, the achievements have not been substantively important; they have constituted only “demonstration in principle.” Nevertheless, the implications are far-reaching.
尽管第二条路径更简单并且显然能够更早实现,但它相对被忽视了。Fredkin 的特里记忆提供了一个有前途的范例。在适当的时候,我们可能会看到人们认真努力开发计算机程序,这些程序可以像语音中的单词和短语一样连接在一起,以完成目前所需的任何计算或控制。显然,阻碍这种努力的考虑因素是,这种努力不会产生在现有计算机环境中具有重大价值的成果。在任何计算机器能够对这种语言做出有意义的响应之前,开发这种语言是没有回报的。
Although the second path is simpler and apparently capable of earlier realization, it has been relatively neglected. Fredkin’s trie memory provides a promising paradigm. We may in due course see a serious effort to develop computer programs that can be connected together like the words and phrases of speech to do whatever computation or control is required at the moment. The consideration that holds back such an effort, apparently, is that the effort would produce nothing that would be of great value in the context of existing computers. It would be unrewarding to develop the language before there are any computing machines capable of responding meaningfully to it.
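The second path, in which an operator calls preprogrammed segments into action simply by name and chains them together, can be illustrated with a small sketch. Everything here is a hypothetical illustration: the registry `SUBROUTINES`, the decorator `subroutine`, and the example routines are assumptions introduced for the example, not anything described in the text.

```python
SUBROUTINES = {}  # hypothetical registry: name -> preprogrammed routine


def subroutine(name):
    """Register a closed subroutine under a name the operator can use."""
    def register(func):
        SUBROUTINES[name] = func
        return func
    return register


@subroutine("square")
def square(values):
    return [v * v for v in values]


@subroutine("sum")
def total(values):
    return sum(values)


def run(names, data):
    """Concatenate the named routines, feeding each result to the next."""
    result = data
    for name in names:
        result = SUBROUTINES[name](result)
    return result


print(run(["square", "sum"], [1, 2, 3]))  # 1 + 4 + 9 = 14
```

The operator states only which routines to connect, like words in a phrase. How each one works stays inside the routine, and an unknown name fails immediately with a `KeyError` rather than silently doing nothing.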
显示器的状态似乎比控件的状态要好一些。许多计算机在示波器屏幕上绘制图形,还有一些计算机利用字符显示管的图形和符号卓越功能。然而,据我所知,没有任何东西可以比得上人们在技术讨论中使用的铅笔和涂鸦板或粉笔和黑板的灵活性和便利性。
Displays seem to be in a somewhat better state than controls. Many computers plot graphs on oscilloscope screens, and a few take advantage of the remarkable capabilities, graphical and symbolic, of the charactron display tube. Nowhere, to my knowledge, however, is there anything approaching the flexibility and convenience of the pencil and doodle pad or the chalk and blackboard used by men in technical discussion.
1)桌面显示和控制:当然,为了有效的人机交互,人和计算机有必要在同一显示表面上互相绘制图表和图片、书写笔记和方程。该人应该能够通过绘制图表以粗略但快速的方式向计算机呈现函数。计算机应该读取该人的书写,也许条件是它是清晰的大写字母,并且它应该立即在每个手绘符号的位置张贴相应的字符,并将其解释为精确的字体。有了这样的输入输出设备,操作员将很快学会以机器可读的方式书写或打印。他可以编写指令和子程序,将它们设置为适当的格式,并在最终将它们引入计算机主存储器之前对其进行检查。他甚至可以定义新的符号,就像 Gilmore 和 Savell (1959) 在林肯实验室所做的那样,并将它们直接呈现给计算机。他可以粗略地勾勒出表格的格式,然后让计算机精确地塑造它。他可以纠正计算机的数据,通过流程图指导机器,并且通常与它进行交互,就像他与另一位工程师一样,只是“另一位工程师”将是一名精确的绘图员,一个闪电计算器,一个助记向导,和许多其他有价值的合作伙伴合而为一。
1) Desk-Surface Display and Control: Certainly, for effective man–computer interaction, it will be necessary for the man and the computer to draw graphs and pictures and to write notes and equations to each other on the same display surface. The man should be able to present a function to the computer, in a rough but rapid fashion, by drawing a graph. The computer should read the man’s writing, perhaps on the condition that it be in clear block capitals, and it should immediately post, at the location of each hand-drawn symbol, the corresponding character as interpreted and put into precise type-face. With such an input-output device, the operator would quickly learn to write or print in a manner legible to the machine. He could compose instructions and subroutines, set them into proper format, and check them over before introducing them finally into the computer’s main memory. He could even define new symbols, as Gilmore and Savell (1959) have done at the Lincoln Laboratory, and present them directly to the computer. He could sketch out the format of a table roughly and let the computer shape it up with precision. He could correct the computer’s data, instruct the machine via flow diagrams, and in general interact with it very much as he would with another engineer, except that the “other engineer” would be a precise draftsman, a lightning calculator, a mnemonic wizard, and many other valuable partners all in one.
2)计算机发布的墙壁显示:在某些技术系统中,几个人共同负责控制行为相互作用的车辆。有些信息必须同时呈现给所有人,最好是在一个共同的网格上,以协调他们的行动。其他信息仅与一两个操作员相关。如果所有信息都在一台显示器上向所有人呈现,只会造成难以解释的混乱。该信息必须通过计算机发布,因为手动绘图太慢而无法保持最新。
2) Computer-Posted Wall Display: In some technological systems, several men share responsibility for controlling vehicles whose behaviors interact. Some information must be presented simultaneously to all the men, preferably on a common grid, to coordinate their actions. Other information is of relevance only to one or two operators. There would be only a confusion of uninterpretable clutter if all the information were presented on one display to all of them. The information must be posted by a computer, since manual plotting is too slow to keep it up to date.
刚才概述的问题即使在现在也是一个关键问题,而且随着时间的推移,它似乎肯定会变得越来越关键。几位设计师坚信,借助闪光灯和基于光阀原理的分时观察屏,可以构建出具有所需特性的显示器。
The problem just outlined is even now a critical one, and it seems certain to become more and more critical as time goes by. Several designers are convinced that displays with the desired characteristics can be constructed with the aid of flashing lights and time-sharing viewing screens based on the light-valve principle.
大多数考虑过这个问题的人都认为,大型显示器应该由单独的显示控制单元来补充。后者将允许操作员在不离开其位置的情况下修改墙壁显示。出于某些目的,操作员希望能够通过辅助显示器甚至可能通过墙壁显示器与计算机进行通信。至少有一种提供这种通信的方案似乎是可行的。
The large display should be supplemented, according to most of those who have thought about the problem, by individual display-control units. The latter would permit the operators to modify the wall display without leaving their locations. For some purposes, it would be desirable for the operators to be able to communicate with the computer through the supplementary displays and perhaps even through the wall display. At least one scheme for providing such communication seems feasible.
当然,大型墙壁显示器及其相关系统与计算机和团队之间的共生合作有关。实验室实验一再表明,操作员的非正式、并行安排,通过参考大的情况显示来协调他们的活动,比更广泛使用的安排具有重要的优势,后者将操作员定位在单独的控制台上,并试图通过计算机的代理机构。这是需要仔细研究的几个运营团队问题之一。
The large wall display and its associated system are relevant, of course, to symbiotic cooperation between a computer and a team of men. Laboratory experiments have indicated repeatedly that informal, parallel arrangements of operators, coordinating their activities through reference to a large situation display, have important advantages over the arrangement, more widely used, that locates the operators at individual consoles and attempts to correlate their actions through the agency of a computer. This is one of several operator-team problems in need of careful study.
3)自动语音生成和识别:人类操作员和计算机之间的语音通信有多理想和可行?每当讨论复杂的数据处理系统时,就会提出这个复合问题。与计算机一起工作和生活的工程师对计算机的需求采取保守的态度。在自动语音识别领域有过经验的工程师对可行性持保守态度。然而,人们对与计算机对话的想法仍然感兴趣。在很大程度上,这种兴趣源于这样一种认识:人们很难让军事指挥官或公司总裁离开工作去教他打字。如果高层决策者直接使用计算机器,那么通过最自然的方式提供通信可能是值得的,即使成本相当高。
3) Automatic Speech Production and Recognition: How desirable and how feasible is speech communication between human operators and computing machines? That compound question is asked whenever sophisticated data-processing systems are discussed. Engineers who work and live with computers take a conservative attitude toward the desirability. Engineers who have had experience in the field of automatic speech recognition take a conservative attitude toward the feasibility. Yet there is continuing interest in the idea of talking with computing machines. In large part, the interest stems from realization that one can hardly take a military commander or a corporation president away from his work to teach him to type. If computing machines are ever to be used directly by top-level decision makers, it may be worthwhile to provide communication via the most natural means, even at considerable cost.
对他的问题和时间尺度的初步分析表明,公司总裁只会将与计算机的共生关系作为一种副业感兴趣。业务情况通常进展缓慢,以便有时间进行简报和会议。因此,计算机专家是直接与商务办公室中的计算机进行交互的人员,这似乎是合理的。
Preliminary analysis of his problems and time scales suggests that a corporation president would be interested in a symbiotic association with a computer only as an avocation. Business situations usually move slowly enough that there is time for briefings and conferences. It seems reasonable, therefore, for computer specialists to be the ones who interact directly with computers in business offices.
另一方面,军事指挥官更有可能在短时间内做出关键决策。人们很容易夸大十分钟战争的概念,但指望有十多分钟的时间来做出关键决定是危险的。因此,随着军事系统地面环境和控制中心的能力和复杂性的增长,对计算机自动语音生成和识别的真正需求似乎可能会发展。当然,如果设备已经开发出来、可靠且可用,就会使用它。
The military commander, on the other hand, faces a greater probability of having to make critical decisions in short intervals of time. It is easy to overdramatize the notion of the ten-minute war, but it would be dangerous to count on having more than ten minutes in which to make a critical decision. As military system ground environments and control centers grow in capability and complexity, therefore, a real requirement for automatic speech production and recognition in computers seems likely to develop. Certainly, if the equipment were already developed, reliable, and available, it would be used.
就可行性而言,语音产生所带来的技术问题不如语音自动识别那么严重。商用电子数字电压表现在可以逐位大声读出其指示。八年或十年来,在贝尔电话实验室、皇家理工学院(斯德哥尔摩)、信号研究与开发机构(基督城)、哈斯金斯实验室和麻省理工学院,邓恩(1950 年);范特(1959);劳伦斯(1956);库珀等人。(1952);史蒂文斯等人。(1953)和他们的同事已经展示了连续几代可理解的自动说话者。哈斯金斯实验室最近的工作导致了一种适合计算机使用的数字代码的开发,它可以使自动语音发出可理解的连接话语(Liberman 等人,1959)。
In so far as feasibility is concerned, speech production poses less severe problems of a technical nature than does automatic recognition of speech sounds. A commercial electronic digital voltmeter now reads aloud its indications, digit by digit. For eight or ten years, at the Bell Telephone Laboratories, the Royal Institute of Technology (Stockholm), the Signals Research and Development Establishment (Christchurch), the Haskins Laboratory, and the Massachusetts Institute of Technology, Dunn (1950), Fant (1959), Lawrence (1956), Cooper et al. (1952), Stevens et al. (1953), and their co-workers have demonstrated successive generations of intelligible automatic talkers. Recent work at the Haskins Laboratory has led to the development of a digital code, suitable for use by computing machines, that makes an automatic voice utter intelligible connected discourse (Liberman et al., 1959).
自动语音识别的可行性在很大程度上取决于要识别的词汇量的大小,以及它必须适应的说话者和口音的多样性。几年前,贝尔电话实验室和林肯实验室演示了对自然口语十进制数字 98% 的正确识别率(Davis 等人,1952 年;Forgie 和 Forgie,1959 年)。在词汇量规模上再进一步,我们可以说,现在几乎可以肯定能够在现有知识的基础上开发出清晰发音的字母数字字符的自动识别器。由于未经培训的操作员朗读的速度至少与经过培训的操作员打字的速度一样快,因此这种设备将成为几乎所有计算机安装中的便捷工具。
The feasibility of automatic speech recognition depends heavily upon the size of the vocabulary of words to be recognized and upon the diversity of talkers and accents with which it must work. Ninety-eight per cent correct recognition of naturally spoken decimal digits was demonstrated several years ago at the Bell Telephone Laboratories and at the Lincoln Laboratory (Davis et al., 1952; Forgie and Forgie, 1959). To go a step up the scale of vocabulary size, we may say that an automatic recognizer of clearly spoken alpha-numerical characters can almost surely be developed now on the basis of existing knowledge. Since untrained operators can read at least as rapidly as trained ones can type, such a device would be a convenient tool in almost any computer installation.
然而,对于真正共生水平上的实时交互,可能需要大约2000个单词的词汇量,例如1000个单词(例如基本英语)和1000个技术术语。这是一个具有挑战性的问题。在声学家和语言学家的共识中,目前还无法完成2000个单词的识别器的构建。然而,有几个组织很乐意承诺每五年开发一个针对此类词汇的自动识别系统。他们会规定演讲要清晰,听写风格,没有异常口音。
For real-time interaction on a truly symbiotic level, however, a vocabulary of about 2000 words, e.g., 1000 words of something like basic English and 1000 technical terms, would probably be required. That constitutes a challenging problem. In the consensus of acousticians and linguists, construction of a recognizer of 2000 words cannot be accomplished now. However, there are several organizations that would happily undertake to develop an automatic recognizer for such a vocabulary on a five-year basis. They would stipulate that the speech be clear speech, dictation style, without unusual accent.
尽管对自动语音识别技术的详细讨论超出了当前的范围,但值得注意的是计算机在自动语音识别器的开发中发挥着主导作用。他们为目前的乐观情绪,或者更确切地说,为目前某些方面的乐观情绪提供了动力。两三年前,对于相当大的词汇量的自动识别似乎需要十年或十五年才能实现。还需要等待语音交流中声学、语音、语言和心理过程知识的进一步积累。然而,现在,许多人看到了借助计算机处理语音信号来加速获取这些知识的前景,并且不少工作人员认为,即使没有语音模式识别,复杂的计算机程序也能很好地执行。语音信号和过程的大量实质性知识的帮助。将这两个考虑因素放在一起,可以将实现具有实际意义的语音识别所需的时间估计缩短到也许五年,即刚才提到的五年。
Although detailed discussion of techniques of automatic speech recognition is beyond the present scope, it is fitting to note that computing machines are playing a dominant role in the development of automatic speech recognizers. They have contributed the impetus that accounts for the present optimism, or rather for the optimism presently found in some quarters. Two or three years ago, it appeared that automatic recognition of sizable vocabularies would not be achieved for ten or fifteen years; that it would have to await much further, gradual accumulation of knowledge of acoustic, phonetic, linguistic, and psychological processes in speech communication. Now, however, many see a prospect of accelerating the acquisition of that knowledge with the aid of computer processing of speech signals, and not a few workers have the feeling that sophisticated computer programs will be able to perform well as speech-pattern recognizers even without the aid of much substantive knowledge of speech signals and processes. Putting those two considerations together brings the estimate of the time required to achieve practically significant speech recognition down to perhaps five years, the five years just mentioned.
经电气和电子工程师协会许可,转载自 Licklider (1960)。
Reprinted from Licklider (1960), with permission from the Institute of Electrical and Electronics Engineers.
约翰·麦卡锡 (John McCarthy,1927-2011) 在加州理工学院本科学习数学,在普林斯顿大学攻读博士学位。1956 年,麦卡锡在达特茅斯学院任教期间,在那里举办了一个暑期学校,主题是他所谓的“人工智能”。根据由 McCarthy、Shannon、Marvin Minsky(当时为哈佛大学初级研究员)和 IBM 研究员 Nathaniel Rochester 共同撰写的会议提案(McCarthy,1960),会议的主题将包括 (1) “自动计算机”(如何编写程序来模拟“人脑的高级功能”);(2)“如何对计算机进行编程以使用一种语言”;(3) “神经网络”(引用 McCulloch 和 Pitts);(4)“计算规模理论”(“函数复杂性理论”);(5)“自我完善”;(6)“摘要”;(7)“随机性和创造性”。所有这些至今仍然是活跃的研究问题!
John McCarthy (1927–2011) studied mathematics at Cal Tech as an undergraduate and at Princeton as a PhD student. While teaching at Dartmouth College in 1956, McCarthy hosted a summer school there on what he dubbed “Artificial Intelligence.” According to the proposal for the conference (McCarthy, 1960), co-authored by McCarthy, Shannon, Marvin Minsky (then a Junior Fellow at Harvard), and the IBM researcher Nathaniel Rochester, the subjects for the conference would include (1) “Automatic computers” (how to write programs to simulate “the higher functions of the human brain”); (2) “How can a computer be programmed to use a language”; (3) “Neuron nets” (citing McCulloch and Pitts); (4) “Theory of the size of a calculation” (“a theory of the complexity of functions”); (5) “Self-improvement”; (6) “Abstractions”; and (7) “Randomness and creativity.” All are still active research problems today!
麦卡锡搬到了麻省理工学院,在那里他对分时技术的诞生发挥了重要作用(第 23 章)。在那里,他发明了 L ISP编程语言,作为一种符号推理工具,他预计这将是人工智能进步的关键。这是一个大胆的举动,直接借用了 Alonzo Church 开发的 lambda 演算作为解决希尔伯特 Entscheidungs 问题的数学语言。在 L ISP的最初描述中,该语言纯粹是解释性的;编译器的开发需要数年时间。麦卡锡必须开发内存管理的垃圾收集技术,以便即使是小程序也可以在当时内存有限的机器上执行。L ISP不仅可用而且具有影响力;McCarthy 参与了 A LGOL 60 的设计,这是第一个广泛使用的具有递归功能的通用语言,而 L ISP影响了后续每种函数式编程语言的设计。
McCarthy moved to MIT, where he was instrumental in the birth of time-sharing (chapter 23). There he invented the LISP programming language as a tool for the sort of symbolic reasoning he anticipated would be key to progress in AI. It was an audacious move, to borrow so directly from the lambda-calculus that Alonzo Church had developed as a mathematical language for resolving Hilbert’s Entscheidungsproblem. In this initial description of LISP, the language was purely interpretive; it would take years before compilers were developed. McCarthy had to develop the garbage collection technique of memory management in order to make even small programs executable on the memory-limited machines of the day. LISP has remained not only usable but influential; McCarthy was involved in the design of ALGOL 60, the first widely-used general purpose language featuring recursion, and LISP has influenced the design of every subsequent functional programming language.
1962 年,麦卡锡搬到了斯坦福大学,在那里他创办了人工智能实验室 (SAIL),该实验室是大量有影响力的人工智能研究的铸造厂。不仅是麦卡锡,SAIL 的其他 15 个附属机构也获得了图灵奖。在他的整个职业生涯中,他几乎独特地将对形式、数学、逻辑基础的根深蒂固的尊重与产生模仿人类思想各个方面的工作代码的雄心结合起来。
In 1962 McCarthy moved to Stanford, where he started the Artificial Intelligence Laboratory (SAIL), the foundry of vast amounts of influential AI research. Not only McCarthy but fifteen other affiliates of the SAIL have been recognized with the Turing Award. Throughout his career, he almost uniquely combined a deeply rooted respect for formal, mathematical, logical foundations with the ambition to produce working code that emulated aspects of human thought.
麻省理工学院的人工智能小组为 IBM 704 计算机开发了一种名为 LISP(LISt 处理器)的编程系统。该系统旨在促进名为 Advice Taker 的提议系统的实验,通过该系统可以指示机器处理陈述句和祈使句,并且在执行指令时可以表现出“常识”。Advice Taker 的最初提案(McCarthy,1961)于 1958 年 11 月提出。主要要求是一个编程系统,用于操作代表形式化陈述句和祈使句的表达式,以便 Advice Taker 系统可以进行推论。
A programming system called LISP (for LISt Processor) has been developed for the IBM 704 computer by the Artificial Intelligence group at M.I.T. The system was designed to facilitate experiments with a proposed system called the Advice Taker, whereby a machine could be instructed to handle declarative as well as imperative sentences and could exhibit “common sense” in carrying out its instructions. The original proposal (McCarthy, 1961) for the Advice Taker was made in November 1958. The main requirement was a programming system for manipulating expressions representing formalized declarative and imperative sentences so that the Advice Taker system could make deductions.
在其发展过程中,LISP 系统经历了几个简化阶段,最终基于一种表示某类符号表达式的部分递归函数的方案。这种表示独立于 IBM 704 计算机或任何其他电子计算机,现在看来,从称为 S 表达式的表达式类和称为 S 函数的函数开始来阐述该系统似乎是方便的。
In the course of its development the LISP system went through several stages of simplification and eventually came to be based on a scheme for representing the partial recursive functions of a certain class of symbolic expressions. This representation is independent of the IBM 704 computer, or of any other electronic computer, and it now seems expedient to expound the system by starting with the class of expressions called S-expressions and the functions called S-functions.
在本文中,我们首先描述递归定义函数的形式。我们相信这种形式主义作为编程语言和发展计算理论的工具都具有优势。接下来,我们描述S-表达式和S-函数,并给出一些例子,然后描述通用S-函数的应用,它起到通用图灵机的理论作用和解释器的实际作用。然后,我们通过类似于 Newell 和 Shaw (1957) 使用的列表结构来描述 IBM 704 存储器中 S 表达式的表示,以及通过程序来表示 S 函数。然后我们提到了IBM 704的L ISP编程系统的主要特点。接下来是另一种用符号表达式描述计算的方法,最后我们给出了流程图的递归函数解释。
In this article, we first describe a formalism for defining functions recursively. We believe this formalism has advantages both as a programming language and as a vehicle for developing a theory of computation. Next, we describe S-expressions and S-functions, give some examples, and then describe the universal S-function apply which plays the theoretical role of a universal Turing machine and the practical role of an interpreter. Then we describe the representation of S-expressions in the memory of the IBM 704 by list structures similar to those used by Newell and Shaw (1957), and the representation of S-functions by program. Then we mention the main features of the LISP programming system for the IBM 704. Next comes another way of describing computations with symbolic expressions, and finally we give a recursive function interpretation of flow charts.
我们希望描述在另一篇论文中使用 LISP 的一些符号计算,并在其他地方给出我们的递归函数形式主义在数理逻辑和机械定理证明问题中的一些应用。
We hope to describe some of the symbolic computations for which LISP has been used in another paper, and also to give elsewhere some applications of our recursive function formalism to mathematical logic and to the problem of mechanical theorem proving.
我们需要一些关于一般函数的数学思想和符号。大多数想法是众所周知的,但条件表达式的概念被认为是新的,并且条件表达式的使用允许以新的且方便的方式递归地定义函数。
We shall need a number of mathematical ideas and notations concerning functions in general. Most of the ideas are well known, but the notion of conditional expression is believed to be new, and the use of conditional expressions permits functions to be defined recursively in a new and convenient way.
A。部分功能。偏函数是仅在其定义域的一部分上定义的函数。当通过计算定义函数时,必然会出现偏函数,因为对于参数的某些值,定义函数值的计算可能不会终止。然而,我们的一些初等函数将被定义为偏函数。
a. Partial Functions. A partial function is a function that is defined only on part of its domain. Partial functions necessarily arise when functions are defined by computations because for some values of the arguments the computation defining the value of the function may not terminate. However, some of our elementary functions will be defined as partial functions.
b. 命题表达式和谓词。命题表达式是其可能值为 T(真值)和 F(假值)的表达式。我们假设读者是熟悉命题连接词 ∧(“和”)、∨(“或”)和∼(“非”),典型的命题表达式是:
b. Propositional Expressions and Predicates. A propositional expression is an expression whose possible values are T (for truth) and F (for falsity). We shall assume that the reader is familiar with the propositional connectives ∧ (“and”), ∨ (“or”), and ∼ (“not”). Typical propositional expressions are:
谓词是一个函数,其范围由真值 T 和 F 组成。
A predicate is a function whose range consists of the truth values T and F.
C。条件表达式。真值对其他类型量值的依赖性在数学中通过谓词表达,真值对其他真值的依赖性通过逻辑连接词表达。然而,用于象征性地表达其他种类的量对真值的依赖性的符号是不够的,因此在象征性地描述其他依赖性的文本中通常使用英语单词和短语来表达这些依赖性。例如,函数 | x | 通常是用文字来定义的。
c. Conditional Expressions. The dependence of truth values on the values of quantities of other kinds is expressed in mathematics by predicates, and the dependence of truth values on other truth values by logical connectives. However, the notations for expressing symbolically the dependence of quantities of other kinds on truth values are inadequate, so that English words and phrases are generally used for expressing these dependences in texts that describe other dependences symbolically. For example, the function |x| is usually defined in words.
条件表达式是表达量对命题量的依赖关系的一种手段。条件表达式的形式为 ( p 1 → e 1 , ⋯ , p n → en ),其中p是命题表达式,e是任何类型的表达式。可以读作“如果p 1则e 1,否则如果p 2则e 2 , ⋯,否则如果p n则en ”,或者“ p 1产生e 1 , ⋯ , p n产生en ”。
Conditional expressions are a device for expressing the dependence of quantities on propositional quantities. A conditional expression has the form (p1 → e1, ⋯ , pn → en), where the p’s are propositional expressions and the e’s are expressions of any kind. It may be read, “If p1 then e1, otherwise if p2 then e2, ⋯, otherwise if pn then en,” or “p1 yields e1, ⋯, pn yields en.”
现在我们给出确定值 ( p 1 → e 1 , ⋯ , p n → e n ) 是否已定义的规则,如果是的话,其值是多少。从左到右检查p 。如果在遇到任何值为未定义的p之前遇到值为 T 的p ,则条件表达式的值为相应e的值(如果已定义)。如果在 true p之前遇到任何未定义的p,或者所有p都为 false,或者如果与第一个 true p对应的e未定义,则条件表达式的值未定义。我们现在举例说明。
We now give the rules for determining whether the value (p1 → e1, ⋯ , pn → en) is defined, and if so what its value is. Examine the p’s from left to right. If a p whose value is T is encountered before any p whose value is undefined is encountered, then the value of the conditional expression is the value of the corresponding e (if this is defined). If any undefined p is encountered before a true p, or if all p’s are false, or if the e corresponding to the first true p is undefined, then the value of the conditional expression is undefined. We now give examples.
条件表达式的一些最简单的应用是给出这样的定义:
Some of the simplest applications of conditional expressions are in giving such definitions as
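One definition of this kind is the absolute value, which can be written |x| = (x < 0 → −x, T → x). The evaluation rules stated above can be sketched in Python as follows. This is only an illustration: the names `cond`, `undefined`, `is_defined`, and `absolute` are assumptions introduced here, and "undefined" is modelled by raising an exception rather than by a non-terminating computation.

```python
class Undefined(Exception):
    """Raised when an expression has no value (an assumption of this sketch)."""


def undefined():
    """Stand-in for an expression whose value is undefined."""
    raise Undefined("undefined expression")


def cond(*clauses):
    """Evaluate (p1 -> e1, ..., pn -> en).

    Each clause is a pair of zero-argument callables, so a p or an e is
    evaluated only when the left-to-right rules actually reach it.
    """
    for p, e in clauses:
        if p():          # an undefined p raises before later clauses are tried
            return e()   # only the e paired with the first true p is evaluated
    raise Undefined("every p was false")


def is_defined(thunk):
    """Report whether evaluating thunk yields a value."""
    try:
        thunk()
        return True
    except Undefined:
        return False


def absolute(x):
    """|x| = (x < 0 -> -x, T -> x)."""
    return cond((lambda: x < 0, lambda: -x),
                (lambda: True, lambda: x))
```

An undefined p that comes after the first true p is never examined, whereas an undefined p that comes before it makes the whole expression undefined.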
d. 递归函数定义。通过使用条件表达式,我们可以通过定义函数出现的公式来定义函数,而无需循环。例如,我们写
d. Recursive Function Definitions. By using conditional expressions we can, without circularity, define functions by formulas in which the defined function occurs. For example, we write
当我们用这个公式来计算0!我们得到答案1;由于条件表达式值的定义方式,表达式 0 · (0 − 1)! 毫无意义。不会出现。评价2!根据这个定义,进行如下:
When we use this formula to evaluate 0! we get the answer 1; because of the way in which the value of a conditional expression was defined, the meaningless expression 0 · (0 − 1)! does not arise. The evaluation of 2! according to this definition proceeds as follows:
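A minimal Python sketch of this evaluation, assuming the definition n! = (n = 0 → 1, T → n · (n − 1)!), which is consistent with the remarks about 0! and 0 · (0 − 1)! above. The helper `expansion` is hypothetical and only records the successive forms that the evaluation passes through.

```python
def factorial(n):
    # n! = (n = 0 -> 1, T -> n * (n - 1)!)
    # Python's conditional expression evaluates only the selected branch,
    # so 0 * (0 - 1)! is never formed when n is 0.
    return 1 if n == 0 else n * factorial(n - 1)


def expansion(n):
    """Return the successive forms taken while evaluating n! (n >= 0)."""
    steps, prefix = [f"{n}!"], ""
    while n > 0:
        prefix += f"{n} * "
        n -= 1
        steps.append(f"{prefix}{n}!")
    steps.append(f"{prefix}1")   # 0! is replaced by 1
    return steps


steps_for_two = expansion(2)   # ['2!', '2 * 1!', '2 * 1 * 0!', '2 * 1 * 1']
```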
现在我们给出递归函数定义的另外两个应用。两个正整数m和n的最大公约数 gcd( m, n )通过欧几里德算法计算。该算法用递归函数定义来表示:
We now give two other applications of recursive function definitions. The greatest common divisor, gcd(m, n), of two positive integers m and n is computed by means of the Euclidean algorithm. This algorithm is expressed by the recursive function definition:
其中 rem( n, m ) 表示n除以m后剩下的余数。
where rem(n, m) denotes the remainder left when n is divided by m.
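As a sketch, one recursive formulation consistent with the description above is gcd(m, n) = (m > n → gcd(n, m), rem(n, m) = 0 → m, T → gcd(rem(n, m), m)). The Python rendering is an illustration under that assumption, with `rem` implemented by the `%` operator for positive integers.

```python
def rem(n, m):
    """Remainder left when n is divided by m (positive integers assumed)."""
    return n % m


def gcd(m, n):
    # gcd(m, n) = (m > n -> gcd(n, m),
    #              rem(n, m) = 0 -> m,
    #              T -> gcd(rem(n, m), m))
    if m > n:
        return gcd(n, m)
    if rem(n, m) == 0:
        return m
    return gcd(rem(n, m), m)
```

For example, gcd(21, 14) first swaps its arguments, then reduces to gcd(7, 14), whose remainder test succeeds and yields 7.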
用于获取数字a的近似平方根的牛顿算法,从初始近似值x开始,并要求可接受的近似值y满足 | y 2 − a | < ε,可以写成
The Newtonian algorithm for obtaining an approximate square root of a number a, starting with an initial approximation x and requiring that an acceptable approximation y satisfy |y² − a| < ε, may be written
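A hedged Python sketch of such a definition, assuming the form sqrt(a, x, ε) = (|x² − a| < ε → x, T → sqrt(a, ½(x + a/x), ε)). The name `newton_sqrt` is introduced here for illustration, and the sketch assumes a > 0 and x > 0.

```python
def newton_sqrt(a, x, eps):
    # (|x^2 - a| < eps -> x,  T -> newton_sqrt(a, (x + a/x) / 2, eps))
    if abs(x * x - a) < eps:
        return x
    return newton_sqrt(a, 0.5 * (x + a / x), eps)


root_two = newton_sqrt(2.0, 1.0, 1e-9)
```

Each recursive call replaces the current approximation by the average of x and a/x, and the recursion stops as soon as the acceptance test holds.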
同时递归定义多个函数也是可能的,如果需要的话我们将使用这样的定义。无法保证由递归定义确定的计算将永远终止,例如尝试计算n!根据我们的定义,只有当n是非负整数时才会成功。如果计算没有终止,则该函数必须被视为对于给定参数未定义。
The simultaneous recursive definition of several functions is also possible, and we shall use such definitions if they are required. There is no guarantee that the computation determined by a recursive definition will ever terminate and, for example, an attempt to compute n! from our definition will only succeed if n is a non-negative integer. If the computation does not terminate, the function must be regarded as undefined for the given arguments.
命题连接词本身可以通过条件表达式来定义。我们写
The propositional connectives themselves can be defined by conditional expressions. We write
很容易看出方程的右侧具有正确的真值表。如果我们考虑p或q可能未定义的情况,则连接词 ∧ 和 ∨ 被视为不可交换。例如,如果p为假且q未定义,则根据上面给出的定义,我们看到p ∧ q为假,但q ∧ p未定义。对于我们的应用程序来说,这种非交换性是可取的,因为p ∧ q是通过首先计算p来计算的,并且如果p为假则不计算q 。如果p的计算没有终止,我们就永远不会计算q。下文中我们将在这个意义上使用命题连接词。
It is readily seen that the right-hand sides of the equations have the correct truth tables. If we consider situations in which p or q may be undefined, the connectives ∧ and ∨ are seen to be noncommutative. For example if p is false and q is undefined, we see that according to the definitions given above p ∧ q is false, but q ∧ p is undefined. For our applications this noncommutativity is desirable, since p ∧ q is computed by first computing p, and if p is false q is not computed. If the computation for p does not terminate, we never get around to computing q. We shall use propositional connectives in this sense hereafter.
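The noncommutativity can be illustrated with a small Python sketch, assuming the definitions p ∧ q = (p → q, T → F), p ∨ q = (p → T, T → q), and ∼p = (p → F, T → T), which reproduce the behaviour described above. Operands are passed as zero-argument callables so that q is evaluated only when needed; the names `and_`, `or_`, `not_`, `undefined`, and `is_defined` are assumptions of this sketch, and an undefined value is modelled by an exception.

```python
class Undefined(Exception):
    """Raised when an expression has no value (an assumption of this sketch)."""


def undefined():
    raise Undefined("undefined expression")


def and_(p, q):
    # p AND q = (p -> q, T -> F): q is evaluated only if p is true
    return q() if p() else False


def or_(p, q):
    # p OR q = (p -> T, T -> q): q is evaluated only if p is false
    return True if p() else q()


def not_(p):
    # NOT p = (p -> F, T -> T)
    return False if p() else True


def is_defined(thunk):
    try:
        thunk()
        return True
    except Undefined:
        return False


def false():
    return False


left = and_(false, undefined)                            # False: q is never computed
right_defined = is_defined(lambda: and_(undefined, false))  # False: p is undefined
```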
e. 功能和形式。在数学中(数理逻辑之外),通常会不精确地使用“函数”一词并将其应用于y 2 + x等形式。因为我们稍后将使用函数表达式进行计算,所以我们需要函数和形式之间的区别以及用于表达这种区别的符号。这种区别和描述它的符号是由 Church (1941) 给出的,我们稍微偏离了这一点。
e. Functions and Forms. It is usual in mathematics—outside of mathematical logic—to use the word “function” imprecisely and to apply it to forms such as y² + x. Because we shall later compute with expressions for functions, we need a distinction between functions and forms and a notation for expressing this distinction. This distinction and a notation for describing it, from which we deviate trivially, is given by Church (1941).
令f为代表两个整数变量函数的表达式。写成f (3, 4) 应该是有意义的,并且应该确定该表达式的值。表达式y 2 + x不满足这个要求;y 2 + x (3, 4) 不是一个传统的表示法,如果我们试图定义它,我们将不确定它的值是 13 还是 19。Church 将像 y 2 + x 这样的表达式称为一种形式。如果我们可以确定表单中出现的变量与所需函数的参数的有序列表之间的对应关系,则可以将表单转换为函数。这是通过 Church 的λ表示法完成的。
Let f be an expression that stands for a function of two integer variables. It should make sense to write f(3, 4) and the value of this expression should be determined. The expression y² + x does not meet this requirement; y² + x(3, 4) is not a conventional notation, and if we attempted to define it we would be uncertain whether its value would turn out to be 13 or 19. Church calls an expression like y² + x a form. A form can be converted into a function if we can determine the correspondence between the variables occurring in the form and the ordered list of arguments of the desired function. This is accomplished by Church’s λ-notation.
如果ℰ是变量x 1 , ⋯ , x n的形式,则λ (( x 1 , ⋯ , x n ), ℰ ) 将被视为n 个变量的函数,其值通过将参数替换为变量x 1 , ⋯ , x n按该顺序放入ℰ中并计算结果表达式。例如,λ (( x, y ) , y 2 + x ) 是两个变量的函数,并且λ (( x, y ) , y 2 + x )(3, 4) = 19。
If ℰ is a form in variables x1, ⋯ , xn, then λ((x1, ⋯ , xn), ℰ) will be taken to be the function of n variables whose value is determined by substituting the arguments for the variables x1, ⋯, xn in that order in ℰ and evaluating the resulting expression. For example, λ((x, y), y² + x) is a function of two variables, and λ((x, y), y² + x)(3, 4) = 19.
出现在λ表达式的变量列表中的变量是虚拟的或有界的,就像定积分中的积分变量一样。也就是说,我们可以更改函数表达式中绑定变量的名称,而不更改表达式的值,前提是我们对每次出现的变量进行相同的更改,并且不会使两个变量与之前不同的变量相同。因此,λ (( x, y ) , y 2 + x )、λ (( u, v ) , v 2 + u ) 和λ (( y, x ) , x 2 + y ) 表示相同的函数。
The variables occurring in the list of variables of a λ-expression are dummy or bound, like variables of integration in a definite integral. That is, we may change the names of the bound variables in a function expression without changing the value of the expression, provided that we make the same change for each occurrence of the variable and do not make two variables the same that previously were different. Thus λ((x, y), y² + x), λ((u, v), v² + u) and λ((y, x), x² + y) denote the same function.
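Python's `lambda` plays a similar role and can serve as a rough illustration, with the form taken to be y² + x. The variable names `f`, `g`, `h`, and `swapped` are introduced only for this sketch.

```python
# Three spellings of the same function: only the bound names differ.
f = lambda x, y: y**2 + x
g = lambda u, v: v**2 + u
h = lambda y, x: x**2 + y

value = f(3, 4)                          # 4**2 + 3 = 19

# Changing the order of the variable list gives a different function;
# this is the ambiguity between 13 and 19 that the lambda notation removes.
swapped = (lambda y, x: y**2 + x)(3, 4)  # 3**2 + 4 = 13
```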
我们将经常使用其中一些变量受λ约束而其他变量则不受约束的表达式。这样的表达式可以被视为定义带参数的函数。未绑定的变量称为自由变量。
We shall frequently use expressions in which some of the variables are bound by λ’s and others are not. Such an expression may be regarded as defining a function with parameters. The unbound variables are called free variables.
区分函数和形式的适当符号可以明确地处理函数的函数。在这里给出例子会涉及太多的题外话,但我们将在本报告的后面部分使用带有函数作为参数的函数。……
An adequate notation that distinguishes functions from forms allows an unambiguous treatment of functions of functions. It would involve too much of a digression to give examples here, but we shall use functions with functions as arguments later in this report. …
我们首先根据有序对和列表定义一类符号表达式。然后我们将定义五个基本函数和谓词,并通过组合、条件表达式和递归定义从它们构建一个广泛的函数类,我们将给出许多例子。然后,我们将展示这些函数本身如何表达为符号表达式,并且我们将定义一个通用函数apply,它允许我们根据给定函数的表达式计算给定参数的值。最后,我们将定义一些以函数为参数的函数,并给出一些有用的例子。
We shall first define a class of symbolic expressions in terms of ordered pairs and lists. Then we shall define five elementary functions and predicates, and build from them by composition, conditional expressions, and recursive definitions an extensive class of functions of which we shall give a number of examples. We shall then show how these functions themselves can be expressed as symbolic expressions, and we shall define a universal function apply that allows us to compute from the expression for a given function its value for given arguments. Finally, we shall define some functions with functions as arguments and give some useful examples.
A。一类符号表达式。我们现在将定义 S 表达式(S 代表符号)。它们是使用特殊字符组成的
a. A Class of Symbolic Expressions. We shall now define the S-expressions (S stands for symbolic). They are formed by using the special characters
。
.
(
(
)
)
以及无限组可区分的原子符号。对于原子符号,我们将使用大写拉丁字母和带有单个嵌入空格的数字字符串。原子符号的例子是
and an infinite set of distinguishable atomic symbols. For atomic symbols, we shall use strings of capital Latin letters and digits with single imbedded blanks. Examples of atomic symbols are
A
A
ABA
ABA
APPLE PIE NUMBER 3
APPLE PIE NUMBER 3
偏离使用单个字母表示原子符号的通常数学实践有双重原因。首先,计算机程序经常需要数百个可区分的符号,这些符号必须由 IBM 704 计算机可打印的 47 个字符组成。其次,出于助记的原因,允许英语单词和短语代表原子实体是很方便的。这些符号是原子的,因为它们作为字符序列可能具有的任何子结构都会被忽略。我们仅假设可以区分不同的符号。
There is a twofold reason for departing from the usual mathematical practice of using single letters for atomic symbols. First, computer programs frequently require hundreds of distinguishable symbols that must be formed from the 47 characters that are printable by the IBM 704 computer. Second, it is convenient to allow English words and phrases to stand for atomic entities for mnemonic reasons. The symbols are atomic in the sense that any substructure they may have as sequences of characters is ignored. We assume only that different symbols can be distinguished.
S 表达式定义如下:
S-expressions are then defined as follows:
1. 原子符号是S表达式。
1. Atomic symbols are S-expressions.
2. 如果e 1和e 2是S表达式,则( e 1 · e 2 )也是S表达式。
2. If e1 and e2 are S-expressions, so is (e1 · e2).
S 表达式的示例是
Examples of S-expressions are
AB
AB
(A · B)
(A · B)
((AB · C) · D)
((AB · C) · D)
S 表达式只是一个有序对,其项可以是原子符号或更简单的 S 表达式。我们可以用 S 表达式来表示任意长度的列表,如下所示。列表 ( m 1 , m 2 , ⋯ , m n ) 由 S 表达式表示
An S-expression is then simply an ordered pair, the terms of which may be atomic symbols or simpler S-expressions. We can represent a list of arbitrary length in terms of S-expressions as follows. The list (m1, m2, ⋯ , mn) is represented by the S-expression (m1 · (m2 · (⋯ (mn · NIL) ⋯ )))
其中 NIL 是用于终止列表的原子符号。
where NIL is an atomic symbol used to terminate lists.
由于我们处理的许多符号表达式都可以方便地表示为列表,因此我们将引入列表表示法来缩写某些 S 表达式。我们有
Since many of the symbolic expressions with which we deal are conveniently expressed as lists, we shall introduce a list notation to abbreviate certain S-expressions. We have
1. ( m ) 代表 ( m · NIL)。
1. (m) stands for (m · NIL).
2. ( m 1 , ⋯ , m n ) 代表( m 1 · ( ⋯ ( m n · NIL) ⋯ ))。
2. (m1, ⋯ , mn) stands for (m1 · (⋯(mn · NIL)⋯ )).
3. ( m 1 , ⋯ , m n · x ) 代表( m 1 · ( ⋯ ( m n · x ) ⋯ ))。
3. (m1, ⋯ , mn · x) stands for (m1 · (⋯(mn · x)⋯ )).
子表达式可以类似地缩写。这些缩写的一些例子是
Subexpressions can be similarly abbreviated. Some examples of these abbreviations are
((AB,C),D)对于((AB·(C·NIL))·(D·NIL))
((AB, C), D) for ((AB · (C · NIL)) · (D · NIL))
((A,B),C,D·E)对于((A·(B·NIL))·(C·(D·E)))
((A, B), C, D · E) for ((A · (B · NIL)) · (C · (D · E)))
由于我们将带逗号的表达式视为不包含逗号的表达式的缩写,因此我们将它们统称为 S 表达式。
Since we regard the expressions with commas as abbreviations for those not involving commas, we shall refer to them all as S-expressions.
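The expansion of list notation into dotted pairs can be sketched in Python. The encoding is an assumption of this sketch: atomic symbols are Python strings, a pair (e1 · e2) is a 2-tuple, and the helper names `from_list` and `dotted` are hypothetical.

```python
NIL = "NIL"


def cons(a, d):
    """Form the pair (a · d)."""
    return (a, d)


def from_list(items, tail=NIL):
    """Expand (m1, ..., mn · tail) into (m1 · (... (mn · tail) ...))."""
    result = tail
    for item in reversed(items):
        result = cons(item, result)
    return result


def dotted(e):
    """Write an S-expression in full dotted-pair notation."""
    if isinstance(e, str):
        return e
    return f"({dotted(e[0])} · {dotted(e[1])})"


# ((AB, C), D) and ((A, B), C, D · E) from the examples above
first = dotted(from_list([from_list(["AB", "C"]), "D"]))
second = dotted(from_list([from_list(["A", "B"]), "C", "D"], tail="E"))
```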
b. S 表达式的函数以及表示它们的表达式。我们现在定义一类 S 表达式的函数。表示这些函数的表达式以传统的函数表示法编写。然而,为了清楚地区分代表函数的表达式和 S 表达式,我们将使用小写字母序列来表示 S 表达式集合上的函数名称和变量。我们还使用方括号和分号,而不是圆括号和逗号,来表示函数对其参数的应用。因此我们写
b. Functions of S-expressions and the Expressions That Represent Them. We now define a class of functions of S-expressions. The expressions representing these functions are written in a conventional functional notation. However, in order to clearly distinguish the expressions representing functions from S-expressions, we shall use sequences of lower-case letters for function names and variables ranging over the set of S-expressions. We also use brackets and semicolons, instead of parentheses and commas, for denoting the application of functions to their arguments. Thus we write
car [x]
car [x]
car [cons [(A · B); x]]
car [cons [(A · B); x]]
在这些 M 表达式(元表达式)中,出现的任何 S 表达式都代表其自身。
In these M-expressions (meta-expressions) any S-expressions that occur stand for themselves.
C。基本 S 函数和谓词。我们引入以下函数和谓词:
c. The Elementary S-functions and Predicates. We introduce the following functions and predicates:
1. atom. atom[x] 的值为 T 或 F,取决于 x 是否是原子符号。因此
1. atom. atom [x] has the value T or F according to whether x is an atomic symbol or not. Thus
atom [X] = T
atom [X] = T
atom [(X · A)] = F
atom [(X · A)] = F
2. eq. eq[x; y] 当且仅当 x 和 y 都是原子时才被定义。如果 x 和 y 是相同的符号,则 eq[x; y] = T,否则 eq[x; y] = F。因此
2. eq. eq[x; y] is defined if and only if both x and y are atomic. eq[x; y] = T if x and y are the same symbol, and eq[x; y] = F otherwise. Thus
eq[X; X] = T
eq[X; X] = T
eq[X; A] = F
eq[X; A] = F
eq[X; (X · A)] 未定义。
eq[X; (X · A)] is undefined.
3. car. car[x] 当且仅当 x 不是原子时才被定义。car[(e1 · e2)] = e1。因此 car[X] 是未定义的。
3. car. car[x] is defined if and only if x is not atomic. car[(e1 · e2)] = e1. Thus car[X] is undefined.
car[(X · A)] = X
car[(X · A)] = X
car[((X · A) · Y)] = (X · A)
car[((X · A) · Y)] = (X · A)
4. cdr. 当 x 不是原子时,也定义 cdr[x]。我们有 cdr[( e 1 · e 2 )] = e 2。因此 cdr[X] 是未定义的。
4. cdr. cdr[x] is also defined when x is not atomic. We have cdr[(e1 · e2)] = e2. Thus cdr[X] is undefined.
cdr[(X·A)] = A
cdr[(X · A)] = A
cdr[((X·A)·Y)] = Y
cdr[((X · A) · Y)] = Y
5. cons. cons[x; y] 是针对任何 x 和 y 定义的。我们有 cons[e1; e2] = (e1 · e2)。因此
5. cons. cons[x; y] is defined for any x and y. We have cons[e1; e2] = (e1 · e2). Thus
cons[X; A] = (X · A)
cons[X; A] = (X · A)
cons[(X · A); Y] = ((X · A) · Y)
cons[(X · A); Y] = ((X · A) · Y)
car、cdr 和 cons 很容易看出满足关系
car, cdr, and cons are easily seen to satisfy the relations
car[cons[x; y]] = x
car[cons[x; y]] = x
cdr[cons[x; y]] = y
cdr[cons[x; y]] = y
cons[car[x]; cdr[x]] = x,前提是 x 不是原子的。
cons[car[x]; cdr[x]] = x, provided that x is not atomic.
只有当我们讨论系统在计算机中的表示时,名称“car”和“cons”才会具有助记意义。car 和 cdr 的组合给出给定表达式在给定位置的子表达式。cons 的组合形成了给定结构的成对表达式。可以用这种方式形成的函数类别非常有限并且不是很有趣。
The names “car” and “cons” will come to have mnemonic significance only when we discuss the representation of the system in the computer. Compositions of car and cdr give the subexpressions of a given expression in a given position. Compositions of cons form expressions of a given structure out of pairs. The class of functions which can be formed in this way is quite limited and not very interesting.
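The five elementary functions and the relations they satisfy can be sketched in Python. The encoding is an assumption of this sketch (atomic symbols as strings, pairs as 2-tuples, T and F as Python booleans), and undefined cases are modelled by raising a hypothetical `Undefined` exception.

```python
class Undefined(Exception):
    """Raised where the text says a function is undefined."""


def atom(x):
    return isinstance(x, str)


def eq(x, y):
    if not (atom(x) and atom(y)):
        raise Undefined("eq is defined only for atomic arguments")
    return x == y


def car(x):
    if atom(x):
        raise Undefined("car of an atomic symbol")
    return x[0]


def cdr(x):
    if atom(x):
        raise Undefined("cdr of an atomic symbol")
    return x[1]


def cons(x, y):
    return (x, y)


pair = cons(cons("X", "A"), "Y")   # ((X · A) · Y)
```

With this encoding, `car(cons(x, y))` returns `x`, `cdr(cons(x, y))` returns `y`, and `cons(car(x), cdr(x))` rebuilds any non-atomic `x`.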
d. 递归 S 函数。当我们允许自己通过条件表达式和递归定义形成 S 表达式的新函数时,我们会得到更大的函数类(实际上,所有可计算函数)。
d. Recursive S-functions. We get a much larger class of functions (in fact, all computable functions) when we allow ourselves to form new functions of S-expressions by conditional expressions and recursive definition.
现在我们给出一些可以用这种方式定义的函数的例子。
We now give some examples of functions that are definable in this way.
ff[x]。ff[x] 的值是 S 表达式 x 的第一个原子符号,忽略括号。因此
ff[x]. The value of ff[x] is the first atomic symbol of the S-expression x with the parentheses ignored. Thus
ff[((A·B)·C)] = A
ff[((A · B) · C)] = A
我们有
We have
ff[x] = [atom[x] → x; T → ff[car[x]]]
ff[x] = [atom[x] → x; T → ff[car[x]]]
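In Python, under the assumption that atomic symbols are strings and pairs are 2-tuples, the same recursion can be sketched as:

```python
def ff(x):
    # ff[x] = [atom[x] -> x; T -> ff[car[x]]]
    return x if isinstance(x, str) else ff(x[0])


first_atom = ff((("A", "B"), "C"))   # ((A · B) · C)
```

The recursion keeps taking the car until it reaches an atomic symbol, so the call on ((A · B) · C) passes through (A · B) before returning A.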
[编辑:省略其他函数和谓词定义和示例。]
[EDITOR: Other function and predicate definitions and examples omitted.]
……
…
F。通用 S 功能适用。存在一个 S 函数,其属性为:如果 f 是 S 函数 f′ 的 S 表达式,并且 args 是 (argl, ⋯ , argn)形式的参数列表,其中 arg1, ⋯ , argn 是任意 S 表达式,然后 apply[f; args] 和 f′[arg1; ……;argn] 被定义为与 arg1、 ⋯ 、argn相同的值,并且定义时相等。例如,
f. The Universal S-Function apply. There is an S-function apply with the property that if f is an S-expression for an S-function f′ and args is a list of arguments of the form (arg1, ⋯, argn), where arg1, ⋯, argn are arbitrary S-expressions, then apply[f; args] and f′[arg1; ⋯; argn] are defined for the same values of arg1, ⋯, argn, and are equal when defined. For example,
λ[[x; y]; cons[car[x]; y]][(A, B); (C, D)]
λ[[x; y]; cons[car[x]; y]][(A, B); (C, D)]
= apply[(LAMBDA, (X, Y), (CONS, (CAR, X), Y)); ((A, B), (C, D))]
= apply[(LAMBDA, (X, Y), (CONS, (CAR, X), Y)); ((A, B), (C, D))]
=(A,C,D)
= (A, C, D)
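A much reduced sketch of such a universal function can be written in Python. It is not the definition of apply given in the paper, which is elided here. As simplifying assumptions, list-notation S-expressions are encoded as Python tuples (so the empty tuple stands in for NIL), only QUOTE, COND, LAMBDA, CAR, CDR, CONS, ATOM, and EQ are handled, and the helper name `eval_` is hypothetical.

```python
def apply(f, args):
    """apply[f; args]: quote each argument, then evaluate the application."""
    return eval_((f,) + tuple(("QUOTE", a) for a in args), {})


def eval_(e, env):
    if isinstance(e, str):                     # a variable: look up its binding
        return env[e]
    head, rest = e[0], e[1:]
    if head == "QUOTE":                        # a quoted S-expression stands for itself
        return rest[0]
    if head == "COND":                         # clauses are (p, e) pairs, tried in order
        for p, r in rest:
            if eval_(p, env) == "T":
                return eval_(r, env)
        raise ValueError("undefined conditional expression")
    values = [eval_(a, env) for a in rest]     # evaluate the arguments
    if isinstance(head, tuple) and head[0] == "LAMBDA":
        _, params, body = head                 # bind parameters, then evaluate the body
        return eval_(body, {**env, **dict(zip(params, values))})
    if head == "CAR":
        return values[0][0]
    if head == "CDR":
        return values[0][1:]
    if head == "CONS":
        return (values[0],) + values[1]
    if head == "ATOM":
        return "T" if isinstance(values[0], str) else "F"
    if head == "EQ":
        return "T" if values[0] == values[1] else "F"
    raise ValueError(f"unknown function: {head!r}")


result = apply(("LAMBDA", ("X", "Y"), ("CONS", ("CAR", "X"), "Y")),
               (("A", "B"), ("C", "D")))
```

For the example above, `result` is the tuple encoding of (A, C, D).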
……
…
G。以函数作为参数的函数。有许多有用的函数,其中一些参数是函数。它们在定义其他函数时特别有用。此类函数之一是 maplist [x; f] 带有一个 S 表达式参数 x 和一个参数 f,该参数 f 是从 S 表达式到 S 表达式的函数。我们定义
g. Functions with Functions as Arguments. There are a number of useful functions some of whose arguments are functions. They are especially useful in defining other functions. One such function is maplist [x; f] with an S-expression argument x and an argument f that is a function from S-expressions to S-expressions. We define
maplist [x; f] = [null [x] → NIL; T → cons [f[x]; maplist [cdr [x]; f]]]
maplist [x; f] = [null [x] → NIL; T → cons [f[x]; maplist [cdr [x]; f]]]
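A Python sketch of this definition, assuming atomic symbols are strings, pairs are 2-tuples, and null[x] is the test that x is the atomic symbol NIL. Note that f is applied to each successive tail of the list, not to each element.

```python
NIL = "NIL"


def maplist(x, f):
    # maplist[x; f] = [null[x] -> NIL; T -> cons[f[x]; maplist[cdr[x]; f]]]
    if x == NIL:
        return NIL
    return (f(x), maplist(x[1], f))


abc = ("A", ("B", ("C", NIL)))             # the list (A, B, C)
heads = maplist(abc, lambda z: z[0])        # car of each tail: (A, B, C) again
tails = maplist(abc, lambda z: z)           # the tails: ((A, B, C), (B, C), (C))
```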
maplist 的有用性通过涉及 x 和其他变量的和与积的表达式的 x 的偏导数公式来说明。我们要微分的 S 表达式的构成如下。
The usefulness of maplist is illustrated by formulas for the partial derivative with respect to x of expressions involving sums and products of x and other variables. The S-expressions that we shall differentiate are formed as follows.
1. 原子符号是允许的表达式。
1. An atomic symbol is an allowed expression.
2. 如果e 1 , ⋯ , en是允许的表达式,则 (PLUS, e 1 , ⋯ , en ) 和 (TIMES, e 1 , ⋯ , en )也是,并且分别表示e 1 , … , en。
2. If e1, ⋯ , en are allowed expressions, (PLUS, e1, ⋯ , en) and (TIMES, e1, ⋯ , en) are also, and represent the sum and product, respectively, of e1, ⋯ , en.
这本质上是函数的波兰表示法,只不过包含括号和逗号允许函数具有可变数量的参数。允许的表达式的示例是 (TIMES, X, (PLUS, X, A), Y),其传统代数表示法是 X(X+A)Y。
This is, essentially, the Polish notation for functions except that the inclusion of parentheses and commas allows functions of variable numbers of arguments. An example of an allowed expression is (TIMES, X, (PLUS, X, A), Y), the conventional algebraic notation for which is X(X+A)Y.
我们的微分公式给出了 y 对 x 的导数,是
Our differentiation formula, which gives the derivative of y with respect to x, is
diff[y; x] = [atom[y] → [eq[y; x] → ONE; T → ZERO];
diff[y; x] = [atom[y] → [eq[y; x] → ONE; T → ZERO];
eq[car[y]; PLUS] → cons[PLUS; maplist[cdr[y]; λ[[z]; diff[car[z]; x]]]];
eq[car[y]; PLUS] → cons[PLUS; maplist[cdr[y]; λ[[z]; diff[car[z]; x]]]];
eq[car[y]; TIMES] → cons[PLUS; maplist[cdr[y]; λ[[z]; cons[TIMES;
eq[car[y]; TIMES] → cons[PLUS; maplist[cdr[y]; λ[[z]; cons[TIMES;
maplist[cdr[y]; λ[[w]; [∼ eq[z; w] → car[w]; T → diff[car[w]; x]]]]]]]]]
maplist[cdr[y]; λ[[w]; [∼ eq[z; w] → car[w]; T → diff[car[w]; x]]]]]]]]]
按此公式计算的允许表达式的导数为
The derivative of the allowed expression, as computed by this formula, is
(PLUS, (TIMES, ONE, (PLUS, X, A), Y), (TIMES, X, (PLUS, ONE, ZERO), Y),
(PLUS, (TIMES, ONE, (PLUS, X, A), Y), (TIMES, X, (PLUS, ONE, ZERO), Y),
(TIMES, X, (PLUS, X, A), ZERO))
(TIMES, X, (PLUS, X, A), ZERO))
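The differentiation formula can be checked with a short Python sketch. As a simplifying assumption, list-notation S-expressions are encoded as Python tuples, so cdr becomes slicing and the empty tuple stands in for NIL. The comparison of z and w is done on whole tails, which identifies the position of the factor being differentiated.

```python
def maplist(x, f):
    """Apply f to each successive non-empty tail of x and collect the results."""
    return () if not x else (f(x),) + maplist(x[1:], f)


def diff(y, x):
    if isinstance(y, str):                      # atom[y]
        return "ONE" if y == x else "ZERO"
    if y[0] == "PLUS":                          # derivative of a sum: sum of derivatives
        return ("PLUS",) + maplist(y[1:], lambda z: diff(z[0], x))
    if y[0] == "TIMES":                         # product rule: one term per factor
        return ("PLUS",) + maplist(
            y[1:],
            lambda z: ("TIMES",) + maplist(
                y[1:],
                # copy every factor except the one at position z, which is differentiated
                lambda w: w[0] if z != w else diff(w[0], x)))
    raise ValueError("not an allowed expression")


expr = ("TIMES", "X", ("PLUS", "X", "A"), "Y")   # X(X+A)Y
derivative = diff(expr, "X")
```

Under this encoding, `derivative` has the same shape as the S-expression shown above; no algebraic simplification is attempted.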
……
…
LISP 编程系统是使用IBM 704计算机以S表达式形式的符号信息进行计算的系统。它已经或将被用于以下目的:
The LISP programming system is a system for using the IBM 704 computer to compute with symbolic information in the form of S-expressions. It has been or will be used for the following purposes:
1. 编写一个编译器,将 LISP 程序编译成机器语言。
1. Writing a compiler to compile LISP programs into machine language.
2. 编写一个程序来检查一类形式逻辑系统中的证明。
2. Writing a program to check proofs in a class of formal logical systems.
3. 编写形式化微分和积分程序。
3. Writing programs for formal differentiation and integration.
4. 编写程序实现谓词演算中生成证明的各种算法。
4. Writing programs to realize various algorithms for generating proofs in predicate calculus.
5. 进行某些工程计算,其结果是公式而不是数字。
5. Making certain engineering calculations whose results are formulas rather than numbers.
6. 对Advice Taker 系统进行编程。
6. Programming the Advice Taker system.
该系统的基础是一种编写计算机程序来评估 S-Function 的方法。这将在以下各节中进行描述。
The basis of the system is a way of writing computer programs to evaluate S-functions. This will be described in the following sections.
除了描述 S-Function 的工具之外,还有在按照 FORTRAN (IBM, 1956) 或 ALGOL (Perlis and Samelson, 1958) 的语句序列编写的程序中使用 S-Function 的工具。本文不会描述这些功能。
In addition to the facilities for describing S-functions, there are facilities for using S-functions in programs written as sequences of statements along the lines of FORTRAN (IBM, 1956) or ALGOL (Perlis and Samelson, 1958). These features will not be described in this article.
A。用列表结构表示 S 表达式。列表结构是如图 21.1a或21.1b所示排列的计算机单词的集合。列表结构的每个单词都由图中细分的矩形之一表示。矩形的左框代表字的地址字段,右框代表减量字段。从一个框到另一个矩形的箭头表示该框对应的字段包含另一个矩形对应的单词的位置。[编辑:IBM 704 提供了方便的工具来操作“寄存器地址字段的内容”(car) 和“寄存器递减字段的内容”(cdr)。在列表结构的每个字中,早期的 L ISP实现将该字的两个指针(指向列表结构的其他字)放置在正确的位位置以利用这些设施。]
a. Representation of S-Expressions by List Structure. List structure is a collection of computer words arranged as in Figure 21.1a or 21.1b. Each word of the list structure is represented by one of the subdivided rectangles in the figure. The left box of a rectangle represents the address field of the word and the right box represents the decrement field. An arrow from a box to another rectangle means that the field corresponding to the box contains the location of the word corresponding to the other rectangle. [EDITOR: The IBM 704 provided convenient facilities for manipulating the “Contents of the Address field of the Register” (car) and the “Contents of the Decrement field of the Register” (cdr). Within each word of a list structure, early LISP implementations placed that word’s two pointers (to other words of the list structure) in the right bit positions to take advantage of those facilities.]
允许子结构出现在列表结构中的多个位置,如图21.1b所示,但不允许结构有循环,如图21.1c所示。
It is permitted for a substructure to occur in more than one place in a list structure, as in Figure 21.1b, but it is not permitted for a structure to have cycles, as in Figure 21.1c.
原子符号在计算机中由特殊形式的列表结构表示,称为符号关联列表。第一个字的地址字段包含一个特殊的常量,使程序能够知道该字代表一个原子符号。……
An atomic symbol is represented in the computer by a list structure of special form called the association list of the symbol. The address field of the first word contains a special constant which enables the program to tell that this word represents an atomic symbol. …
An S-expression x that is not atomic is represented by a word, the address and decrement parts of which contain the locations of the subexpressions car[x] and cdr[x], respectively. …
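[EDITOR: The word layout described above can be sketched in a few lines of modern code. This is only an illustration under stated assumptions: the names `Atom`, `Word`, `cons`, and `to_text` are hypothetical, and atoms are modeled as simple named objects rather than as the association lists the paper describes.]

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """Simplified stand-in for an atomic symbol (assumption: no association list)."""
    name: str


@dataclass
class Word:
    """One word of list structure with its two pointer fields."""
    address: "SExpr"    # location of car[x]
    decrement: "SExpr"  # location of cdr[x]


SExpr = Union[Atom, Word]
NIL = Atom("NIL")


def cons(a: SExpr, d: SExpr) -> Word:
    return Word(a, d)


def car(x: Word) -> SExpr:
    return x.address


def cdr(x: Word) -> SExpr:
    return x.decrement


def to_text(x: SExpr) -> str:
    """Print an S-expression in dotted-pair notation."""
    if isinstance(x, Atom):
        return x.name
    return "(" + to_text(car(x)) + " . " + to_text(cdr(x)) + ")"


# A substructure may occur in more than one place, as in Figure 21.1b:
shared = cons(Atom("B"), NIL)
expr = cons(shared, cons(Atom("A"), shared))
```

Here `car(expr)` and `cdr(cdr(expr))` are the same word, so the shared subexpression is stored only once.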
The advantages of list structures for the storage of symbolic expressions are:
1. The size and even the number of expressions with which the program will have to deal cannot be predicted in advance. Therefore, it is difficult to arrange blocks of storage of fixed length to contain them.
2. Registers can be put back on the free-storage list when they are no longer needed. Even one register returned to the list is of value, but if expressions are stored linearly, it is difficult to make use of blocks of registers of odd sizes that may become available.
3. An expression that occurs as a subexpression of several expressions need be represented in storage only once.
…
e. Free-Storage List. At any given time only a part of the memory reserved for list structures will actually be in use for storing S-expressions. The remaining registers (in our system the number, initially, is approximately 15,000) are arranged in a single list called the free-storage list. A certain register, FREE, in the program contains the location of the first register in this list. When a word is required to form some additional list structure, the first word on the free-storage list is taken and the number in register FREE is changed to become the location of the second word on the free-storage list. No provision need be made for the user to program the return of registers to the free-storage list.
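[EDITOR: A minimal sketch of the free-storage list, assuming a small pool of words addressed by index. The class name `ListStore`, the `END` marker, and the use of the decrement field to thread the free words together are assumptions made for illustration; the paper does not spell out these details. In the system described, running out of free words starts a reclamation cycle; this sketch simply raises an error at that point.]

```python
class ListStore:
    """Fixed pool of two-field words threaded into a free-storage list."""

    END = -1  # hypothetical marker for the end of the free-storage list

    def __init__(self, size: int) -> None:
        self.address = [None] * size    # address (car) field of each word
        self.decrement = [None] * size  # decrement (cdr) field of each word
        # Initially every word is free: word i links to word i + 1.
        for i in range(size):
            self.decrement[i] = i + 1 if i + 1 < size else self.END
        # The FREE register holds the location of the first free word.
        self.free = 0 if size > 0 else self.END

    def cons(self, a, d) -> int:
        """Take the first free word and fill in its two fields."""
        if self.free == self.END:
            raise MemoryError("free-storage list exhausted")
        word = self.free
        # FREE now holds the location of the second word on the list.
        self.free = self.decrement[word]
        self.address[word] = a
        self.decrement[word] = d
        return word


store = ListStore(3)
first = store.cons("A", None)
```

After the call, `first` is word 0 and the FREE register holds 1, the location of what had been the second free word.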
This return takes place automatically, approximately as follows (it is necessary to give a simplified description of this process in this report): There is a fixed set of base registers in the program which contains the locations of list structures that are accessible to the program. Of course, because list structures branch, an arbitrary number of registers may be involved. Each register that is accessible to the program is accessible because it can be reached from one or more of the base registers by a chain of car and cdr operations. When the contents of a base register are changed, it may happen that the register to which the base register formerly pointed cannot be reached by a car–cdr chain from any base register. Such a register may be considered abandoned by the program because its contents can no longer be found by any possible program; hence its contents are no longer of interest, and so we would like to have it back on the free-storage list. This comes about in the following way.
Nothing happens until the program runs out of free storage. When a free register is wanted, and there is none left on the free-storage list, a reclamation cycle starts. First, the program finds all registers accessible from the base registers and makes their signs negative. This is accomplished by starting from each of the base registers and changing the sign of every register that can be reached from it by a car–cdr chain. If the program encounters a register in this process which already has a negative sign, it assumes that this register has already been reached.
After all of the accessible registers have had their signs changed, the program goes through the area of memory reserved for the storage of list structures and puts all the registers whose signs were not changed in the previous step back on the free-storage list, and makes the signs of the accessible registers positive again.
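[EDITOR: The two phases of the reclamation cycle can be sketched as follows. This is a simplified illustration, not the original implementation: the function name `reclaim`, the `END` marker, the list `negative` standing in for the sign bit of each word, and the convention that a non-negative integer field is a pointer while any other value is an atom are all assumptions.]

```python
END = -1  # hypothetical marker for the end of the free-storage list


def reclaim(address, decrement, base_registers):
    """Run one reclamation cycle and return the new value of FREE.

    address[w] and decrement[w] are the two fields of word w. An int field
    is taken to be the location of another word; anything else is an atom.
    """
    size = len(address)
    negative = [False] * size  # stands in for the sign of each word

    # Phase 1: make negative the sign of every word reachable from a base
    # register by a chain of car and cdr operations.
    pending = [r for r in base_registers if isinstance(r, int)]
    while pending:
        w = pending.pop()
        if negative[w]:
            continue  # already reached by another chain
        negative[w] = True
        for field in (address[w], decrement[w]):
            if isinstance(field, int):
                pending.append(field)

    # Phase 2: go through the whole area; put unmarked words back on the
    # free-storage list and make the signs of the others positive again.
    free = END
    for w in range(size - 1, -1, -1):
        if negative[w]:
            negative[w] = False
        else:
            address[w] = None
            decrement[w] = free  # thread the word onto the free list
            free = w
    return free


# Words 0, 1, and 4 are reachable from the base register; 2 and 3 are abandoned.
address = ["A", "B", "C", "D", 0]
decrement = [1, None, 3, None, None]
free = reclaim(address, decrement, base_registers=[4])
```

In this example the cycle returns 2: words 2 and 3 form the new free-storage list, and the accessible words 0, 1, and 4 are left untouched.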
This process, because it is entirely automatic, is more convenient for the programmer than a system in which he has to keep track of and erase unwanted lists. Its efficiency depends upon not coming close to exhausting the available memory with accessible lists. This is because the reclamation process requires several seconds to execute, and therefore must result in the addition of at least several thousand registers to the free-storage list if the program is not to spend most of its time in reclamation. …
Reprinted from McCarthy (1960), with permission from the Association for Computing Machinery.
Doug Engelbart (1925–2013) read Bush’s “As We May Think” (chapter 11) while stationed with the Navy in the South Pacific at the end of World War II. It must have seemed like science fiction, but he spent his career trying to put flesh on the bones of Bush’s vision.
Engelbart’s “augmented human intellect” project was carried out mostly at SRI in Menlo Park, California (originally the Stanford Research Institute). The project encompassed collaboration between humans and interactivity between humans and computers, in order to amplify human intellectual capacity. This selection is a part of a long, early outline of the project. Engelbart’s work was never considered mainstream, even at the highly innovative SRI, but it fell under the umbrella of research funding flowing from ARPA, some of it under the direction of J. C. R. Licklider. Like many daring technological investments, parts of the project never took hold (for example, a one-handed keyboard on which five fingers, each depressing a separate wand, could be placed in 31 different positions, enough to input the letters of the Roman alphabet). Others had enormous influence (such as the mouse). And others, such as the idea of distributed, collaborative workflows, greatly influenced—or were rediscovered as aspects of—later developments.
Engelbart both abstracted and concretized Licklider’s vision in a famous 1968 demonstration that came to be known as “the mother of all demos.” It dazzled the audience at the Fall Joint Computer Conference in San Francisco with many interactive technologies that have now become commonplace: the mouse, networking, videoconferencing, hyperlinks, collaborative text editing, and windowing, among others.
A decade later SRI sold Engelbart’s laboratory to a for-profit business, where it did not flourish, in part because of Engelbart’s determined idiosyncrasies, and in part because the booming personal computer and networking industries were independently changing the way computers were being used. For all his technical wizardry, Engelbart was more than anything spiritually committed to human improvement. He was a creature of the 1960s, and with the advent of the 1970s commercialization was coming. Engelbart received the Turing Award in 1997 “for an inspiring vision of the future of interactive computing and the invention of key technologies to help realize this vision.”
By “augmenting human intellect” we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems. Increased capability in this respect is taken to mean a mixture of the following: more-rapid comprehension, better comprehension, the possibility of gaining a useful degree of comprehension in a situation that previously was too complex, speedier solutions, better solutions, and the possibility of finding solutions to problems that before seemed insoluble. And by “complex situations” we include the professional problems of diplomats, executives, social scientists, life scientists, physical scientists, attorneys, designers, whether the problem situation exists for twenty minutes or twenty years. We do not speak of isolated clever tricks that help in particular situations. We refer to a way of life in an integrated domain where hunches, cut-and-try, intangibles, and the human “feel for a situation” usefully co-exist with powerful concepts, streamlined terminology and notation, sophisticated methods, and high-powered electronic aids.
Man’s population and gross product are increasing at a considerable rate, but the complexity of his problems grows still faster, and the urgency with which solutions must be found becomes steadily greater in response to the increased rate of activity and the increasingly global nature of that activity. Augmenting man’s intellect, in the sense defined above, would warrant full pursuit by an enlightened society if there could be shown a reasonable approach and some plausible benefits.
This report covers the first phase of a program aimed at developing means to augment the human intellect. These “means” can include many things—all of which appear to be but extensions of means developed and used in the past to help man apply his native sensory, mental, and motor capabilities—and we consider the whole system of a human and his augmentation means as a proper field of search for practical possibilities. It is a very important system to our society, and like most systems its performance can best be improved by considering the whole as a set of interacting components rather than by considering the components in isolation.
This kind of system approach to human intellectual effectiveness does not find a ready-made conceptual framework such as exists for established disciplines. Before a research program can be designed to pursue such an approach intelligently, so that practical benefits might be derived within a reasonable time while also producing results of long-range significance, a conceptual framework must be searched out: a framework that provides orientation as to the important factors of the system, the relationships among these factors, the types of change among the system factors that offer likely improvements in performance, and the sort of research goals and methodology that seem promising.
In the first (search) phase of our program we have developed a conceptual framework that seems satisfactory for the current needs of designing a research phase. §22.2 contains the essence of this framework as derived from several different ways of looking at the system made up of a human and his intellect-augmentation means.
The process of developing this conceptual framework brought out a number of significant realizations: that the intellectual effectiveness exercised today by a given human has little likelihood of being intelligence limited—that there are dozens of disciplines in engineering, mathematics, and the social, life, and physical sciences that can contribute improvements to the system of intellect-augmentation means; that any one such improvement can be expected to trigger a chain of coordinating improvements; that until every one of these disciplines comes to a standstill and we have exhausted all the improvement possibilities we could glean from it, we can expect to continue to develop improvements in this human-intellect system; that there is no particular reason not to expect gains in personal intellectual effectiveness from a concerted system-oriented approach that compare to those made in personal geographic mobility since horseback and sailboat days. …
Let us consider an “augmented” architect at work. He sits at a working station that has a visual display screen some three feet on a side; this is his working surface, and is controlled by a computer (his “clerk”) with which he can communicate by means of a small keyboard and various other devices.
He is designing a building. He has already dreamed up several basic layouts and structural forms, and is trying them out on the screen. The surveying data for the layout he is working on now have already been entered, and he has just coaxed the clerk to show him a perspective view of the steep hillside building site with the roadway above, symbolic representations of the various trees that are to remain on the lot, and the service tie points for the different utilities. The view occupies the left two-thirds of the screen. With a “pointer,” he indicates two points of interest, moves his left hand rapidly over the keyboard, and the distance and elevation between the points indicated appear on the right-hand third of the screen.
Now he enters a reference line with his pointer, and the keyboard. Gradually the screen begins to show the work he is doing—a neat excavation appears in the hillside, revises itself slightly, and revises itself again. After a moment, the architect changes the scene on the screen to an overhead plan view of the site, still showing the excavation. A few minutes of study, and he enters on the keyboard a list of items, checking each one as it appears on the screen, to be studied later.
Ignoring the representation on the display, the architect next begins to enter a series of specifications and data—a six-inch slab floor, twelve-inch concrete walls eight feet high within the excavation, and so on. When he has finished, the revised scene appears on the screen. A structure is taking shape. He examines it, adjusts it, pauses long enough to ask for handbook or catalog information from the clerk at various points, and readjusts accordingly. He often recalls from the “clerk” his working lists of specifications and considerations to refer to them, modify them, or add to them. These lists grow into an evermore-detailed, interlinked structure, which represents the maturing thought behind the actual design.
Prescribing different planes here and there, curved surfaces occasionally, and moving the whole structure about five feet, he finally has the rough external form of the building balanced nicely with the setting and he is assured that this form is basically compatible with the materials to be used as well as with the function of the building.
Now he begins to enter detailed information about the interior. Here the capability of the clerk to show him any view he wants to examine (a slice of the interior, or how the structure would look from the roadway above) is important. He enters particular fixture designs, and examines them in a particular room. He checks to make sure that sun glare from the windows will not blind a driver on the roadway, and the “clerk” computes the information that one window will reflect strongly onto the roadway between 6 and 6:30 on midsummer mornings.
Next he begins a functional analysis. He has a list of the people who will occupy this building, and the daily sequences of their activities. The “clerk” allows him to follow each in turn, examining how doors swing, where special lighting might be needed. Finally he has the “clerk” combine all of these sequences of activity to indicate spots where traffic is heavy in the building, or where congestion might occur, and to determine what the severest drain on the utilities is likely to be.
All of this information (the building design and its associated “thought structure”) can be stored on a tape to represent the design manual for the building. Loading this tape into his own clerk, another architect, a builder, or the client can maneuver within this design manual to pursue whatever details or insights are of interest to him—and can append special notes that are integrated into the design manual for his own or someone else’s later benefit.
In such a future working relationship between human problem-solver and computer “clerk,” the capability of the computer for executing mathematical processes would be used whenever it was needed. However, the computer has many other capabilities for manipulating and displaying information that can be of significant benefit to the human in nonmathematical processes of planning, organizing, studying, etc. Every person who does his thinking with symbolized concepts (whether in the form of the English language, pictographs, formal logic, or mathematics) should be able to benefit significantly.
Every process of thought or action is made up of sub-processes. Let us consider such examples as making a pencil stroke, writing a letter of the alphabet, or making a plan. Quite a few discrete muscle movements are organized into the making of a pencil stroke; similarly, making particular pencil strokes and making a plan for a letter are complex processes in themselves that become sub-processes to the over-all writing of an alphabetic character.
Although every sub-process is a process in its own right, in that it consists of further sub-processes, there seems to be no point here in looking for the ultimate bottom of the process-hierarchical structure. There seems to be no way of telling whether or not the apparent bottoms (processes that cannot be further subdivided) exist in the physical world or in the limitations of human understanding.
In any case, it is not necessary to begin from the bottom in discussing particular process hierarchies. No person uses a process that is completely unique every time he tackles something new. Instead, he begins from a group of basic sensory-mental-motor process capabilities, and adds to these certain of the process capabilities of his artifacts. There are only a finite number of such basic human and artifact capabilities from which to draw. Furthermore, even quite different higher order processes may have in common relatively high-order sub-processes.
When a man writes prose text (a reasonably high-order process), he makes use of many processes as sub-processes that are common to other high-order processes. For example, he makes use of planning, composing, dictating. The process of writing is utilized as a sub-process within many different processes of a still higher order, such as organizing a committee, changing a policy, and so on.
What happens, then, is that each individual develops a certain repertoire of process capabilities from which he selects and adapts those that will compose the processes that he executes. This repertoire is like a tool kit, and just as the mechanic must know what his tools can do and how to use them, so the intellectual worker must know the capabilities of his tools and have good methods, strategies, and rules of thumb for making use of them. All of the process capabilities in the individual’s repertoire rest ultimately upon basic capabilities within him or his artifacts, and the entire repertoire represents an inter-knit, hierarchical structure (which we often call the repertoire hierarchy).
We find three general categories of process capabilities within a typical individual’s repertoire. There are those that are executed completely within the human integument, which we call explicit-human process capabilities; there are those possessed by artifacts for executing processes without human intervention, which we call explicit-artifact process capabilities; and there are what we call the composite process capabilities, which are derived from hierarchies containing both of the other kinds.
We assume that it is our H-LAM/T system (Human using Language, Artifacts, Methodology, in which he is Trained) that has the capability and that performs the process in any instance of use of this repertoire. Let us look within the process structure for the LAM/T ingredients, to get a better “feel” for our models. Consider the process of writing an important memo. There is a particular concept associated with this process—that of putting information into a formal package and distributing it to a set of people for a certain kind of consideration—and the type of information package associated with this concept has been given the special name of memorandum. Already the system language shows the effect of this process—i.e., a concept and its name. …
This view of the system as an interacting whole is strongly bolstered by considering the repertoire hierarchy of process capabilities that is structured from the basic ingredients within the H-LAM/T system. The realization that any potential change in language, artifact, or methodology has importance only relative to its use within a process and that a new process capability appearing anywhere within that hierarchy can make practical a new consideration of latent change possibilities in many other parts of the hierarchy—possibilities in either language, artifacts, or methodology—brings out the strong interrelationship of these three augmentation means.
Increasing the effectiveness of the individual’s use of his basic capabilities is a problem in redesigning the changeable parts of a system. The system is actively engaged in the continuous processes (among others) of developing comprehension within the individual and of solving problems; both processes are subject to human motivation, purpose, and will. To redesign the system’s capability for performing these processes means redesigning all or part of the repertoire hierarchy. To redesign a structure, we must learn as much as we can of what is known about the basic materials and components as they are utilized within the structure; beyond that, we must learn how to view, to measure, to analyze, and to evaluate in terms of the functional whole and its purpose. In this particular case, no existing analytic theory is by itself adequate for the purpose of analyzing and evaluating over-all system performance; pursuit of an improved system thus demands the use of experimental methods.
It need not be just the very sophisticated or formal process capabilities that are added or modified in this redesign. Essentially any of the processes utilized by a representative human today—the processes that he thinks of when he looks ahead to his day’s work—are composite processes of the sort that involve external composing and manipulating of symbols (text, sketches, diagrams, lists, etc.). Many of the external composing and manipulating (modifying, rearranging) processes serve such characteristically “human” activities as playing with forms and relationships to ask what develops, cut-and-try multiple-pass development of an idea, or listing items to reflect on and then rearranging and extending them as thoughts develop.
Existing, or near-future, technology could certainly provide our professional problem-solvers with the artifacts they need to have for duplicating and rearranging text before their eyes, quickly and with a minimum of human effort. Even so apparently minor an advance could yield total changes in an individual’s repertoire hierarchy that would represent a great increase in over-all effectiveness. Normally the necessary equipment would enter the market slowly; changes from the expected would be small, people would change their ways of doing things a little at a time, and only gradually would their accumulated changes create markets for more radical versions of the equipment. Such an evolutionary process has been typical of the way our repertoire hierarchies have grown and formed.
But an active research effort, aimed at exploring and evaluating possible integrated changes throughout the repertoire hierarchy, could greatly accelerate this evolutionary process. The research effort could guide the product development of new artifacts toward taking long-range meaningful steps; simultaneously competitively minded individuals who would respond to demonstrated methods for achieving greater personal effectiveness would create a market for the more radical equipment innovations. The guided evolutionary process could be expected to be considerably more rapid than the traditional one.
The category of “more radical innovations” includes the digital computer as a tool for the personal use of an individual. Here there is not only promise of great flexibility in the composing and rearranging of text and diagrams before the individual’s eyes but also promise of many other process capabilities that can be integrated into the H-LAM/T system’s repertoire hierarchy.
22.2.3.1 The source of intelligence. When one looks at a computer system that is doing a very complex job, he sees on the surface a machine that can execute some extremely sophisticated processes. If he is a layman, his concept of what provides this sophisticated capability may endow the machine with a mysterious power to sweep information through perceptive and intelligent synthetic thinking devices. Actually, this sophisticated capability results from a very clever organizational hierarchy so that pursuit of the source of intelligence within this system would take one down through layers of functional and physical organization that become successively more primitive.
To be more specific, we can begin at the top and list the major levels down through which we would pass if we successively decomposed the functional elements of each level in search of the “source of intelligence.” A programmer could take us down through perhaps three levels (depending upon the sophistication of the total process being executed by the computer) perhaps depicting the organization at each level with a flow chart. The first level down would organize functions corresponding to statements in a problem-oriented language (e.g., ALGOL or COBOL), to achieve the desired over-all process. The second level down would organize lesser functions into the processes represented by first-level statements. The third level would perhaps show how the basic machine commands (or rather the processes which they represent) were organized to achieve each of the functions of the second level.
Then a machine designer could take over, and with a block diagram of the computer’s organization he could show us (Level 4) how the different hardware units (e.g., random-access storage, arithmetic registers, adder, arithmetic control) are organized to provide the capability of executing sequences of the commands used in Level 3. The logic designer could then give us a tour of Level 5, also using block diagrams, to show us how such hardware elements as pulse gates, flip-flops, and AND, OR, and NOT circuits can be organized into networks giving the functions utilized at Level 4. For Level 6 a circuit engineer could show us diagrams revealing how components such as transistors, resistors, capacitors, and diodes can be organized into modular networks that provide the functions needed for the elements of Level 5.
不同类型的设备工程师和物理学家可以带我们进入更多层次。但很快我们就跨越了人为组织和自然组织之间的界限,并最终讨论给定的物理现象是如何从亚原子粒子的内在组织中衍生出来的,我们有能力解释后续各层因我们目前人类理解力的耗尽而受阻。
Device engineers and physicists of different kinds could take us down through more layers. But rather soon we have crossed the boundary between what is man-organized and what is nature-organized, and are ultimately discussing the way in which a given physical phenomenon is derived from the intrinsic organization of sub-atomic particles, with our ability to explain succeeding layers blocked by the exhaustion of our present human comprehension.
如果我们问自己这种智能体现在哪里,我们就被迫承认它难以捉摸地分布在功能过程的层次结构中——这个层次结构的基础延伸到我们理解深度以下的自然过程。如果说这种智力依赖于任何一件事的话,那就是组织。生物学家和生理学家使用术语“协同作用”来表示(Webster,1959)“ ……离散机构的合作行动,使得总效果大于两个独立效果的总和…… ”。这个术语似乎直接适用于此,我们可以说协同是代表实际情报来源的最有可能的候选者
If we then ask ourselves where that intelligence is embodied, we are forced to concede that it is elusively distributed throughout a hierarchy of functional processes—a hierarchy whose foundation extends down into natural processes below the depth of our comprehension. If there is any one thing upon which this intelligence depends, it would seem to be organization. The biologists and physiologists use a term “synergism” to designate (Webster, 1959) the “… cooperative action of discrete agencies such that the total effect is greater than the sum of the two effects taken independently….” This term seems directly applicable here, where we could say that synergism is our most likely candidate for representing the actual source of intelligence.
事实上,我们观察到的每一个社会、生活或物理现象似乎都源自有组织的功能(或过程)的支持层次结构,其中协同原则为每个后续的更高层次的组织提供了越来越多的现象学复杂性。尤其是人类的智力,最终源于个体神经细胞的特性,无疑是协同作用的结果。
Actually, each of the social, life, or physical phenomena we observe about us would seem to derive from a supporting hierarchy of organized functions (or processes), in which the synergistic principle gives increased phenomenological sophistication to each succeedingly higher level of organization. In particular, the intelligence of a human being, derived ultimately from the characteristics of individual nerve cells, undoubtedly results from synergism.
22.2.3.2 智力放大 在本研究过程中,有人多次开玩笑地表示,我们正在寻找的是“智力放大器”。(这个术语最初被认为是 W. Ross Ashby [1952, 1956]。)最初这个术语被拒绝,因为我们认为,人类唯一的希望是在现有的人类智能和要解决的问题之间做出更好的匹配。 ,而不是让人类变得更加聪明。但是推导上一节中提出的概念向我们表明,这个术语确实似乎适用于我们的目标。
22.2.3.2 Intelligence amplification It has been jokingly suggested several times during the course of this study that what we are seeking is an “intelligence amplifier.” (The term is attributed originally to W. Ross Ashby [1952, 1956].) At first this term was rejected on the grounds that in our view one’s only hope was to make a better match between existing human intelligence and the problems to be tackled, rather than in making man more intelligent. But deriving the concepts brought out in the preceding section has shown us that indeed this term does seem applicable to our objective.
接受“智力放大”这个术语并不意味着任何增加人类固有智力的尝试。“智力放大”一词似乎适用于我们增强人类智力的目标,因为要产生的实体将比无人辅助的人类表现出更多的所谓智力;我们将通过将人类的智力能力组织成更高水平的协同结构来放大人类的智力。拥有放大的智能的是由此产生的H-LAM/T系统,其中LAM/T增强装置代表了人类智能的放大器。
Accepting the term “intelligence amplification” does not imply any attempt to increase native human intelligence. The term “intelligence amplification” seems applicable to our goal of augmenting the human intellect in that the entity to be produced will exhibit more of what can be called intelligence than an unaided human could; we will have amplified the intelligence of the human by organizing his intellectual capabilities into higher levels of synergistic structuring. What possesses the amplified intelligence is the resulting H-LAM/T system, in which the LAM/T augmentation means represent the amplifier of the human’s intelligence.
在增强我们的智力的过程中,我们应用了协同结构的原理,并遵循自然进化来发展人类的基本能力。我们在开发增强手段时所做的就是建造一个上层建筑,它是其所建立的自然结构的综合延伸。在非常现实的意义上,正如所代表的那样随着增强手段的稳步发展,“人工智能”的发展已经持续了几个世纪。
In amplifying our intelligence, we are applying the principle of synergistic structuring that was followed by natural evolution in developing the basic human capabilities. What we have done in the development of our augmentation means is to construct a superstructure that is a synthetic extension of the natural structure upon which it is built. In a very real sense, as represented by the steady evolution of our augmentation means, the development of “artificial intelligence” has been going on for centuries.
22.2.3.3 双域系统 人和工件是H-LAM/T 系统中唯一的物理组件。系统的最终能力将取决于它们的能力。前面的陈述暗示了这一点,即系统的每个复合过程最终都会分解为显式人类过程和显式工件过程。因此,H-LAM/T 系统中有两个独立的活动领域:以人类为代表,所有显式人类过程都在其中发生;以及由工件表示的,其中发生所有显式工件过程。在任何复合过程中,两个域之间都存在协作交互,需要能量交换(其中大部分仅用于信息交换目的)。图 22.1描述了这个两域概念并体现了下面讨论的其他概念。
22.2.3.3 Two-domain systems The human and the artifacts are the only physical components in the H-LAM/T system. It is upon their capabilities that the ultimate capability of the system will depend. This was implied in the earlier statement that every composite process of the system decomposes ultimately into explicit-human and explicit-artifact processes. There are thus two separate domains of activity within the H-LAM/T system: that represented by the human, in which all explicit-human processes occur; and that represented by the artifacts, in which all explicit-artifact processes occur. In any composite process, there is cooperative interaction between the two domains, requiring interchange of energy (much of it for information exchange purposes only). Figure 22.1 depicts this two-domain concept and embodies other concepts discussed below.
图 22.1: H-LAM/T 系统的两侧
Figure 22.1: The Two Sides of the H-LAM/T System
当复杂的机器代表人类合作的主要工件时,“人机界面”一词多年来一直被用来代表两个领域之间能量交换的边界。然而,自从人类开始使用人工制品并执行复合过程以来,“人与人工制品界面”已经存在了几个世纪。
Where a complex machine represents the principal artifact with which a human being cooperates, the term “man–machine interface” has been used for some years to represent the boundary across which energy is exchanged between the two domains. However, the “man–artifact interface” has existed for centuries, ever since humans began using artifacts and executing composite processes.
当显式人工流程与显式工件流程耦合时,就会发生跨此“接口”的交换。通常,这些耦合进程就是为了这种交换目的而设计的,以便在埋藏在各自领域内执行更重要任务的其他显式人类进程和显式工件进程之间提供功能匹配。例如,手指和手的运动(显式人类过程)激活打字机中的按键链接运动(与显式人工过程耦合)。但这些只是指导输入给定单词的更深层次的人类过程与实际上在纸上印上墨水标记的更深层次的人工过程之间的匹配过程的一部分。……
Exchange across this “interface” occurs when an explicit-human process is coupled to an explicit-artifact process. Quite often these coupled processes are designed for just this exchange purpose, to provide a functional match between other explicit-human and explicit-artifact processes buried within their respective domains that do the more significant things. For instance, the finger and hand motions (explicit human processes) activate key-linkage motions in the typewriter (couple to explicit-artifact processes). But these are only part of the matching processes between the deeper human processes that direct a given word to be typed and the deeper artifact processes that actually imprint the ink marks on the paper. …
22.3.2.1 背景 为了尝试让您(读者)对我们的论文有一种特定的感受,尽管存在这种情况,我们将通过描述如果给您一个一位友好的同事(名叫乔)进行了个人讨论演示,他是一位训练有素且经验丰富的用户,在一个实验研究项目中使用了这种增强系统,该项目比我们目前的阶段要好几年。我们假设您在接受本次演示采访时的背景与本报告前一部分所提供的背景类似,也就是说,您将听到或读到一组概括和一些相当原始的示例,但您还没有了解过。对基于计算机的增强系统如何真正帮助人有了很大的了解。
22.3.2.1 Background To try to give you (the reader) a specific sort of feel for our thesis in spite of this situation, we shall present the following picture of computer-based augmentation possibilities by describing what might happen if you were being given a personal discussion-demonstration by a friendly fellow (named Joe) who is a trained and experienced user of such an augmentation system within an experimental research program which is several years beyond our present stage. We assume that you approach this demonstration-interview with a background similar to what the previous portion of this report provides—that is, you will have heard or read a set of generalizations and a few rather primitive examples, but you will not yet have been given much of a feel for how a computer-based augmentation system can really help a person.
乔理解这一点,并解释说,他将尽最大努力为您提供所需的有效概念感觉,尝试在过于详细而失去整体视图与过于笼统而无法为您提供坚实的观点之间划定界限。感受正在发生的事情。他建议你坐下来观察他一段时间,看他从事一些典型的工作,然后他会做一些解释。你对此并没有感到特别受宠若惊,因为你知道他只是要在他的新工件上运用新的语言和方法论开发——毕竟,这些工件看起来与你的预期没有一点不同——所以为什么要这样做呢?他让你坐在那里,就好像你对这些东西完全陌生一样?正如您经常被告知的那样,这只是“让计算机为他执行一些符号操作过程,以便他可以使用更强大的概念和概念操作技术”的问题。
Joe understands this and explains that he will do his best to give you the valid conceptual feel that you want—trying to tread the narrow line between being too detailed and losing your over-all view and being too general and not providing you with a solid feel for what goes on. He suggests that you sit and watch him for a while as he pursues some typical work, after which he will do some explaining. You are not particularly flattered by this, since you know that he is just going to be exercising new language and methodology developments on his new artifacts—and after all, the artifacts don’t look a bit different from what you expected—so why should he keep you sitting there as if you were a complete stranger to this stuff? It will just be a matter of “having the computer do some of his symbol-manipulating processes for him so that he can use more powerful concepts and concept-manipulation techniques,” as you have so often been told.
乔并排有两个显示屏,但他似乎并不像另一个那样经常使用其中一个。而且屏幕几乎是水平的,更像是绘图台的表面,而不是您想象中的近乎垂直的图片显示。但你很容易看出原因,因为他在显示表面上工作就像绘图员在他的图纸上工作一样专注,而伸手到垂直的表面上进行这种工作会很尴尬。有时乔会用双手敲击键盘,显然会以很高的速度将信息输入计算机。
Joe has two display screens side by side, but one of them he doesn’t seem to use as much as the other. And the screens are almost horizontal, more like the surface of a drafting table than the near-vertical picture displays you had somehow imagined. But you see the reason easily, for he is working on the display surface as intently as a draftsman works on his drawings, and it would be awkward to reach out to a vertical surface for this kind of work. Some of the time Joe is using both hands on the keys, obviously feeding information into the computer at a great rate.
不过,另一个小小的惊喜是,您会看到每只手都在显示框架自己一侧的一组按键上进行操作,因此双手之间的距离几乎为两英尺。但很明显,这种布置使他能够以相当自然的位置保持在框架上,这样当他从空中拿起光笔时(这是它的静止位置,这要归功于一个由关节支撑臂和一个系统组成的系统) (用于连接线的受控张力和倒带系统)他的手仍在从按键组到显示框架的途中。当他用完展示架上的笔时,他松开笔,绳子重新卷起,笔又回到原来的位置。因此,在框架上进行工作所需的努力、移动和时间最少。也就是说,他可以用任意一只手轻松地从使用按键组切换到使用光笔(每只手各放置一支笔),而无需移动头部、转身或倾斜。
Another slight surprise, though—you see that each hand operates on a set of keys on its own side of the display frames, so that the hands are almost two feet apart. But it is plain that this arrangement allows him to remain positioned over the frames in a rather natural position, so that when he picks the light pen out of the air (which is its rest position, thanks to a system of jointed supporting arms and a controlled tension and rewind system for the attached cord) his hand is still on the way from the keyset to the display frame. When he is through with the pen at the display frame, he lets go of it, the cord rewinds, and the pen is again in position. There is thus a minimum of effort, movement, and time involved in turning to work on the frame. That is, he could easily shift back and forth from using keyset to using light pen, with either hand (one pen is positioned for each hand), without moving his head, turning, or leaning.
不过,乔的大部分时间似乎都花在一只手敲击键盘上,另一只手在显示屏表面上使用光笔。正是在这种工作模式下,显示屏上的图像变化最为动态。当您意识到这些显像管表面有多少活动时,您会收到另一个真正的惊喜。你问自己为什么没有为此做好准备,你被迫承认你所听到的概括并没有真正被理解——“操纵符号的新方法”是一个经常被重复的术语,但它只是没有被接受。不包括乔可以自由而快速地更改显示内容的图像,以及可以如此迅速地发生的有意义且灵活的想法和工作状态“塑造”的图像。
A good deal of Joe’s time, though, seems to be spent with one hand on a keyset and the other using a light pen on the display surface. It is in this type of working mode that the images on the display frames changed most dynamically. You receive another real surprise as you realize how much activity there is on the face of these display tubes. You ask yourself why you weren’t prepared for this, and you are forced to admit that the generalizations you had heard hadn’t really sunk in—“new methods for manipulating symbols” had been an oft-repeated term, but it just hadn’t included for you the images of the free and rapid way in which Joe could make changes in the display, and of meaningful and flexible “shaping” of ideas and work status which could take place so rapidly.
然后你意识到你根本无法理解他正在做的具体事情,也无法理解你在显示屏上看到的大部分内容。你可以认出很多单词,但有很多单词显然是某种特殊的缩写。当给定的图像或图像的一部分保持足够长的时间不变以供您研究一下时,您很少会看到任何看起来像句子的东西,因为您已经习惯了看到句子。你开始意识到还有其他符号与可能成为句子一部分的单词混合在一起,并且构成完整思想陈述的不同部分(你对句子是什么的感觉)不仅仅是被布置出来的如你所料地结束。但乔突然清除了显示屏,转向你,咧嘴一笑,这标志着被动观察期的结束,而且不知何故告诉你,他非常清楚,你现在知道你需要这样一段时间来摆脱一些你有限的想象,并真正意识到“能力层次”是一件丰富而重要的事情。
Then you realized that you couldn’t make any sense at all out of the specific things he was doing, nor of the major part of what you saw on the displays. You could recognize many words, but there were a good number that were obviously special abbreviations of some sort. During the times when a given image or portion of an image remained unchanged long enough for you to study it a bit, you rarely saw anything that looked like a sentence as you were used to seeing one. You were beginning to gather that there were other symbols mixed with the words that might be part of a sentence, and that the different parts of what made a full-thought statement (your feeling about what a sentence is) were not just laid out end to end as you expected. But Joe suddenly cleared the displays and turned to you with a grin that signalled the end of the passive observation period, and also that somehow told you that he knew very well that you now knew that you had needed such a period to shake out some of your limited images and to really realize that a “capability hierarchy” was a rich and vital thing.
“我想你注意到我正在使用不熟悉的概念、符号和流程来做你更不熟悉的事情?” 你不置可否地点点头——你看不出有什么理由向他承认你甚至无法分辨出他正在做的哪些事情是与其他哪些事情合作——然后他继续说道。“为了让您了解正在发生的事情,我将开始讨论和演示我一直在使用的一些非常基本的操作和概念。我确信您已经阅读过有关流程和流程能力层次结构的内容。从过去向人们解释激进增强系统的经验中,我知道他们感兴趣的新的、强大的高级功能——因为基本上这些都是我们都渴望改进的——如果不首先给予他们,就无法真正向他们解释他们对构建它们所依赖的新的强大功能有一定的了解。这对于低级能力类型来说是正确的,这种能力对他们来说是新的和不同的,但他们通常不会认为是“强大的”。然而,如果没有它们,我们的系统就不会那么强大,如果一个人对这些基本功能以及由它们构建的层次结构没有一定的了解,那么他对系统的理解就会相当肤浅。最高水平的能力。” ……
“I guess you noticed that I was using unfamiliar notions, symbols, and processes to go about doing things that were even more unfamiliar to you?” You made a non-committal nod—you saw no reason to admit to him that you hadn’t even been able to tell which of the things he had been doing were to cooperate with which other things—and he continued. “To give you a feel for what goes on, I’m going to start discussing and demonstrating some of the very basic operations and notions I’ve been using. You’ve read the stuff about process and process-capability hierarchies, I’m sure. I know from past experience in explaining radical augmentation systems to people that the new and powerful higher-level capabilities that they are interested in—because basically those are what we are all anxious to improve—can’t really be explained to them without first giving them some understanding of the new and powerful capabilities upon which they are built. This holds true right on down the line to the type of low-level capability that is new and different to them all right, but that they just wouldn’t ordinarily see as being ‘powerful.’ And yet our systems wouldn’t be anywhere near as powerful without them, and a person’s comprehension of the system would be rather shallow if he didn’t have some understanding of these basic capabilities and of the hierarchical structure built up from them to provide the highest-level capabilities.”…
关于所提出的想法的意义和含义,可以得出三个主要结论。
Three principal conclusions may be drawn concerning the significance and implications of the ideas that have been presented.
首先,任何提高社会问题解决者智力有效利用的可能性都值得认真考虑。这是因为人解决问题的能力可能代表了社会所拥有的最重要的资源。其他首要竞争者的开发和使用都严重依赖于该资源。任何发展能够直接且显着地与该资源的持续开发相结合的艺术或科学的可能性都应该值得双重认真考虑。
First, any possibility for improving the effective utilization of the intellectual power of society’s problem solvers warrants the most serious consideration. This is because man’s problem-solving capability represents possibly the most important resource possessed by a society. The other contenders for first importance are all critically dependent for their development and use upon this resource. Any possibility for evolving an art or science that can couple directly and significantly to the continued development of that resource should warrant doubly serious consideration.
其次,所提出的想法要从上述两个意义上来考虑:直接发展的意义和“发展的艺术”的意义。诚然,这些可能性具有长期影响,但它们的追求和最初的回报现在就在等待着我们。我们认为,我们不必等到了解人类心理过程如何工作,我们不必等到我们学会如何使计算机更智能、更大或更快,我们就可以开始开发强大且经济上可行的增强系统基于我们现在所知道和拥有的信息。对进一步基础知识和改进机器的追求将持续到无限的未来,并且希望融入“艺术”及其改进的增强系统中——但现在开始不仅会为这些追求提供方向和刺激,而且会给予我们提高了解决问题的效率,从而实现了目标。
Second, the ideas presented are to be considered in both of the above senses: the direct-development sense and the “art of development” sense. To be sure, the possibilities have long-term implications, but their pursuit and initial rewards await us now. By our view, we do not have to wait until we learn how the human mental processes work, we do not have to wait until we learn how to make computers more intelligent or bigger or faster, we can begin developing powerful and economically feasible augmentation systems on the basis of what we now know and have. Pursuit of further basic knowledge and improved machines will continue into the unlimited future, and will want to be integrated into the “art” and its improved augmentation systems—but getting started now will provide not only orientation and stimulation for these pursuits, but will give us improved problem-solving effectiveness with which to carry out the pursuits.
第三,越来越明显的是,现在应该在一些研究团体中采取行动,而且规模更大,越早越好。我们提供了概念框架和行动计划,并建议认真考虑这些作为行动的基础。如果它们被考虑但被发现不可接受,那么至少应该认真和持续地努力制定一个更可接受的概念框架,在其中审视总体方法,制定一个更可接受的行动计划,或两者兼而有之。
Third, it becomes increasingly clear that there should be action now—the sooner the better—action in a number of research communities and on an aggressive scale. We offer a conceptual framework and a plan for action, and we recommend that these be considered carefully as a basis for action. If they be considered but found unacceptable, then at least serious and continued effort should be made toward developing a more acceptable conceptual framework within which to view the over-all approach, toward developing a more acceptable plan of action, or both.
这是对研究人员和那些最终激励、资助或指导他们的人的公开呼吁,要求他们认真关注发展一门动态学科的可能性,该学科可以从整体意义上解决提高智力效率的问题。该学科的目标应该是产生一个持续的改进循环——增加对问题的理解,改进开发新增强系统的方法,以及改进增强系统,为世界上一般的问题解决者,特别是该学科的工作者服务。毕竟,我们在旨在理解和利用核能的学科上花费了大量资金。为什么不考虑发展一门旨在理解和利用“神经力量”的学科呢?从长远来看,人类智力的力量确实是两者中更重要的。
This is an open plea to researchers and to those who ultimately motivate, finance, or direct them, to turn serious attention toward the possibility of evolving a dynamic discipline that can treat the problem of improving intellectual effectiveness in a total sense. This discipline should aim at producing a continuous cycle of improvements—increased understanding of the problem, improved means for developing new augmentation systems, and improved augmentation systems that can serve the world’s problem solvers in general and this discipline’s workers in particular. After all, we spend great sums for disciplines aimed at understanding and harnessing nuclear power. Why not consider developing a discipline aimed at understanding and harnessing “neural power”? In the long run, the power of the human intellect is really much the more important of the two.
经 SRI International 许可,转载自 Engelbart (1962)。
Reprinted from Engelbart (1962), with permission from SRI International.
到 20 世纪 50 年代中期,研究人员在构建和使用电子计算机方面拥有了足够的经验,开始想象计算机可能以不同的方式进化,以便更多的人可以使用它们来解决更多问题。1954 年,在麻省理工学院的一所暑期学校中,格蕾丝·霍珀 (Grace Hopper) 和约翰·巴克斯 (John Backus) 之间发生了一次引人注目的交流,约翰·巴克斯 (John Backus) 此后不久就开发了 FORTRAN 编程语言。“Grace Hopper 博士提出了并行使用多台小型计算机的可能性。最大的需求是小型机器。…她预见到一种大规模生产的小型机器,并配有适合客户需求的编译器和库。JW Backus 先生以计算机速度为由不同意这一理念。由于提高速度的成本并不高,因此使用大型计算机比使用小型计算机更便宜。……约翰·巴科斯(John Backus)说,通过分时,一台大计算机可以用作几台小计算机;每个用户都需要一个阅读站”(Adams et al., 1954, pp. 16-1–16-2)。
By the mid-1950s, researchers had enough experience with building and using electronic computers to begin imagining different ways computers might evolve so that more people could use them to solve more problems. In a 1954 summer school at MIT, a remarkable exchange occurred between Grace Hopper and John Backus, who would soon thereafter develop the FORTRAN programming language. “Dr. Grace Hopper raised the possibility of using several small computers in parallel. The greatest demand was for small machines. … She foresaw a mass produced small machine, delivered with a compiler and library appropriate to the customer’s needs. Mr. J. W. Backus disagreed with this philosophy on the grounds of computer speed; since increased speed costs little more, a large computer is cheaper to use than a small one. … John Backus said that by time sharing, a big computer could be used as several small ones; there would need to be a reading station for each user” (Adams et al., 1954, pp. 16-1–16-2).
霍珀和巴科斯同意不同的观点,因为霍珀更多地考虑商业应用,而巴科斯更多地考虑科学用途。但分时是可以想象的。几年后,由费尔南多·科尔巴托(Fernando Corbató,1926-2019)领导的麻省理工学院的一个团队完全实现了这一目标。
Hopper and Backus agreed to differ on the basis that Hopper was thinking more about business applications while Backus was thinking about scientific uses. But time-sharing had been imagined. A few years later, a group at MIT led by Fernando Corbató (1926–2019) fully actualized it.
Corbató 于 1956 年在麻省理工学院获得物理学博士学位,并继续帮助运营计算中心。1958 年左右,当时在 MIT 任教的 John McCarthy 提议通过实施分时系统来扩展该中心 IBM 计算机的容量,正如 Licklider 报道的那样(本卷第 207 页)。1961 年,在解决了 MIT 内部的一些内讧并获得 IBM 的合作后,Corbató 在程序员 Marjorie Merwin(生于 1928 年,后来的 Daggett)和 Robert C. Daley 的协助下,拼凑出了一个基本系统。正如他在口述历史中所解释的那样,其目标主要是“让怀疑者相信这不是一项不可能完成的任务,同时让人们感受到交互式计算。让我感到惊讶的是,现在仍然令人惊讶的是,人们无法想象拥有一个交互式终端会产生什么样的心理差异。你可以在黑板上谈论它,直到你脸色发青,人们会说,“哦,是的,但你为什么需要这个?” 你知道,我们过去常常尝试思考所有这些类比,比如用给你母亲寄一封信和打电话之间的区别来描述它。直到今天,我仍然记得人们只有在看到真正的演示时才意识到,说,‘嘿,它会说话。哇!你只需输入它,就会得到答案”(Corbató 和 Norberg,1989)。
Corbató had finished his PhD in physics at MIT in 1956 and was kept on to help run the Computation Center. Around 1958 John McCarthy, then on the MIT faculty, proposed expanding the capacity of the Center’s IBM computer by implementing a time-sharing system, as reported by Licklider (page 207 of this volume). In 1961, after navigating some infighting within MIT and securing IBM’s cooperation, Corbató, assisted by programmers Marjorie Merwin (b. 1928, later Daggett) and Robert C. Daley, cobbled together a rudimentary system. The goal was mostly, as he explained in an oral history, “to convince the skeptics that it was not an impossible task, and also, to get people to get a feel for interactive computing. It was amazing to me, and it is still amazing, that people could not imagine what the psychological difference would be to have an interactive terminal. You can talk about it on a blackboard until you are blue in the face, and people would say, ‘Oh, yes, but why do you need that?’ You know, we used to try to think of all these analogies, like describing it in terms of the difference between mailing a letter to your mother and getting on the telephone. To this day I can still remember people only realizing when they saw a real demo, say, ‘Hey, it talks back. Wow! You just type that and you got an answer.’” (Corbató and Norberg, 1989).
第一个系统称为兼容分时系统 (CTSS);“兼容性”是指交互式作业和在后台运行的计算密集型作业之间的兼容性。它是 MULTICS 系统(“多重信息和计算服务”)的基础,并被其取代。麻省理工学院持续开发该系统十年,并由通用电气和霍尼韦尔将其商业化,但取得了有限的成功。贝尔实验室也参与了开发工作,但由于该项目被认为过于臃肿而退出。Ken Thompson 是那里的开发人员之一,他从 MULTICS 中吸取了教训,作为 UNIX 操作系统(第 37 章)的基础,他将其命名为一个轻松的双关语。
This first system was called the Compatible Time-Sharing System (CTSS); the “compatibility” was between the interactive jobs and the compute-bound jobs that were running in the background. It was the basis for, and was supplanted by, the MULTICS system (for “MULTiplexed Information and Computing Service”). MIT continued developing the system for a decade, and it was commercialized, with limited success, by GE and then Honeywell. Bell Labs was also involved in the development effort, but pulled out when the project was judged to have become bloated. Ken Thompson was among the developers there and took the lessons learned from MULTICS as the basis for the UNIX operating system (chapter 37), which he so named as a light-hearted pun.
本文的有趣之处在于其数学性能分析以及对计算机系统早期发展的回顾。
Among the interesting aspects of this paper are its mathematical performance analysis and its retrospective on the early development of computer systems.
本文的目的是简要讨论分时的必要性、一些实现问题、为当代 IBM 7090 开发的实验性分时系统,以及最后我们其中一个人的调度算法( FJC)说明了一些可用于增强此类分时系统的性能限制并对其性能限制进行分析的技术。
It is the purpose of this paper to discuss briefly the need for time-sharing, some of the implementation problems, an experimental time-sharing system which has been developed for the contemporary IBM 7090, and finally a scheduling algorithm of one of us (FJC) that illustrates some of the techniques which may be employed to enhance and be analyzed for the performance limits of such a time-sharing system.
过去十几年,计算机的使用取得了长足的进步。20世纪50年代初,解决的问题主要是硬件的建设和维护;20世纪50年代中期,随着编译器的出现,使用语言得到了很大的改进;现在是 20 世纪 60 年代初期,我们正处于计算机使用的第三次重大变革之中:通过称为分时的过程来改进人机交互。
The last dozen years of computer usage have seen great strides. In the early 1950s, the problems solved were largely in the construction and maintenance of hardware; in the mid-1950s, the usage languages were greatly improved with the advent of compilers; now in the early 1960s, we are in the midst of a third major modification to computer usage: the improvement of man–machine interaction by a process called time-sharing.
本文中表达的大部分分时理念是与麻省理工学院初步研究委员会(由 H. Teager 担任主席)的工作结合开发的,该委员会研究了该研究所的长期计算需求,以及随后的麻省理工学院计算机工作委员会,由 J. McCarthy 担任主席。然而,本文中表达的观点和结论应仅代表作者的观点和结论。
Much of the time-sharing philosophy, expressed in this paper, has been developed in conjunction with the work of an MIT preliminary study committee, chaired by H. Teager, which examined the long range computational needs of the Institute, and a subsequent MIT computer working committee, chaired by J. McCarthy. However, the views and conclusions expressed in this paper should be taken as solely those of the present authors.
在进一步讨论之前,最好对分时给出更精确的解释。一种可能意味着同时使用硬件的不同部分来执行不同的任务,也可能意味着多个人同时使用计算机。第一个含义通常称为多道程序设计,面向硬件效率,即试图实现所有组件的完全利用(Schmitt 和 Tonik,1959;Codd,1960;Heller,1961;Leeds 和 Weinberg,1961)。这里所说的分时的第二个含义主要与尝试使用计算机的人的效率有关(Strachey,1959;利克莱德,1960;布朗等人,1962)。计算机效率仍然应该被考虑,但只能从整个系统效用的角度来看。
Before proceeding further, it is best to give a more precise interpretation to time-sharing. One can mean using different parts of the hardware at the same time for different tasks, or one can mean several persons making use of the computer at the same time. The first meaning, often called multiprogramming, is oriented towards hardware efficiency in the sense of attempting to attain complete utilization of all components (Schmitt and Tonik, 1959; Codd, 1960; Heller, 1961; Leeds and Weinberg, 1961). The second meaning of time-sharing, which is meant here, is primarily concerned with the efficiency of persons trying to use a computer (Strachey, 1959; Licklider, 1960; Brown et al., 1962). Computer efficiency should still be considered but only in the perspective of the total system utility.
使用分时计算机的动机源于目前更大、更先进的计算机可能导致的人机交互速度缓慢。在计算机广泛使用的过去十年中,这一比率几乎没有变化(在某些情况下变得更糟)(Teager 和 McCarthy,1959)。
The motivation for time-shared computer usage arises out of the slow man–computer interaction rate presently possible with the bigger, more advanced computers. This rate has changed little (and has become worse in some cases) in the last decade of widespread computer use (Teager and McCarthy, 1959).
在某种程度上,这种效果是由于当基本问题在计算机上被掌握时,更复杂的问题立即变得有趣。结果,编写了更大、更复杂的程序来利用更大、更快的计算机。这个过程不可避免地会导致更多的编程错误和更长的调试时间。使用当前的批量监控技术(如在大多数大型计算机上所做的那样),每个程序错误通常需要几个小时(如果不是一整天)才能消除。目前唯一可用的替代方案是程序员尝试直接在计算机上进行调试,这一过程严重浪费计算机时间,并且通常会受到不良控制台通信的严重阻碍。即使打字机是控制台,通常也缺乏复杂的查询和响应程序,而这些程序对于实现有效的交互至关重要。因此,我们希望在不造成较大经济损失的情况下,大幅提高程序员与计算机之间的交互率,同时通过广泛而复杂的系统编程来辅助人机通信,使每次交互变得更有意义。
In part, this effect has been due to the fact that as elementary problems become mastered on the computer, more complex problems immediately become of interest. As a result, larger and more complicated programs are written to take advantage of larger and faster computers. This process inevitably leads to more programming errors and a longer period of time required for debugging. Using current batch monitor techniques, as is done on most large computers, each program bug usually requires several hours to eliminate, if not a complete day. The only alternative presently available is for the programmer to attempt to debug directly at the computer, a process which is grossly wasteful of computer time and hampered seriously by the poor console communication usually available. Even if a typewriter is the console, there are usually lacking the sophisticated query and response programs which are vitally necessary to allow effective interaction. Thus, what is desired is to drastically increase the rate of interaction between the programmer and the computer without large economic loss and also to make each interaction more meaningful by extensive and complex system programming to assist in the man–computer communication.
为了解决这些交互问题,我们希望有一台计算机能够以类似于电话交换机的方式同时供许多用户使用。每个用户都可以按照自己的节奏使用控制台,而不必担心使用该系统的其他人的活动。该控制台至少可以只是一台打字机,但更理想的是包含一个可增量修改的自维持显示器。无论如何,数据传输要求应该使得从计算机进行远程安装不会成为主要障碍。
To solve these interaction problems we would like to have a computer made simultaneously available to many users in a manner somewhat like a telephone exchange. Each user would be able to use a console at his own pace and without concern for the activity of others using the system. This console could at a minimum be merely a typewriter but more ideally would contain an incrementally modifiable self-sustaining display. In any case, data transmission requirements should be such that it would be no major obstacle to have remote installation from the computer proper.
分时系统的基本技术是让许多人通过打字机控制台同时使用计算机,分时管理程序在短突发或计算量中顺序运行每个用户程序。在最简单的情况下,这个序列是一个简单的循环,应该经常发生,以便保存在高速存储器中的每个用户程序在每个近似的人类反应时间内至少运行一次量子(〜 .2秒)。通过这种方式,每个用户都会看到计算机完全响应单个击键,每个击键可能只需要微不足道的计算;在重要的情况下,用户会看到响应时间逐渐减少,这与响应计算的复杂性、计算机的速度和活跃用户总数成正比。然而,应该清楚的是,如果有n 个用户同时主动请求服务,则每个用户平均只能看到 1 /n的有效计算机速度。在调试程序时的高交互率期间,这不应该成为障碍,因为通常,与最终生产需求相比,每个调试计算机响应所需的计算量很小。
The basic technique for a time-sharing system is to have many persons simultaneously using the computer through typewriter consoles with a time-sharing supervisor program sequentially running each user program in a short burst or quantum of computation. This sequence, which in the most straightforward case is a simple round-robin, should occur often enough so that each user program which is kept in the high-speed memory is run for a quantum at least once during each approximate human reaction time (~0.2 seconds). In this way, each user sees a computer fully responsive to even single key strokes each of which may require only trivial computation; in the non-trivial cases, the user sees a gradual reduction of the response time which is proportional to the complexity of the response calculation, the slowness of the computer, and the total number of active users. It should be clear, however, that if there are n users actively requesting service at one time, each user will only see on the average 1/n of the effective computer speed. During the period of high interaction rates while debugging programs, this should not be a hindrance since ordinarily the required amount of computation needed for each debugging computer response is small compared to the ultimate production need.
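The round-robin scheme described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the CTSS supervisor itself: the function names `round_robin` and `max_users` are hypothetical, times are whole milliseconds, and the costs of swapping programs and of the supervisor's own work are ignored.

```python
from collections import deque


def round_robin(jobs, quantum):
    """Simulate a simple round-robin supervisor (illustrative only).

    jobs:    dict mapping a user name to the computation time, in
             milliseconds, that the user's pending request needs.
    quantum: length of one burst of computation, in milliseconds.
    Returns a dict mapping each user to the clock time (ms) at which
    that user's request completes.
    """
    queue = deque(jobs.items())  # users are served in insertion order
    clock = 0
    finished = {}
    while queue:
        user, remaining = queue.popleft()
        burst = min(quantum, remaining)  # run for at most one quantum
        clock += burst
        remaining -= burst
        if remaining > 0:
            queue.append((user, remaining))  # back to the end of the line
        else:
            finished[user] = clock
    return finished


def max_users(reaction_time, quantum):
    """Largest number of resident programs that can each receive one
    quantum within one human reaction time (assumes zero overhead)."""
    return reaction_time // quantum


if __name__ == "__main__":
    # Three users with equal 30 ms requests and a 10 ms quantum.
    print(round_robin({"a": 30, "b": 30, "c": 30}, 10))
    # With a 200 ms reaction time and a 10 ms quantum:
    print(max_users(200, 10))
```

With three equally demanding users, each 30 ms request completes only after 70 to 90 ms of elapsed time, which is the "1/n of the effective computer speed" effect noted in the text. A trivial request, such as a 10 ms job queued alongside longer ones, still completes within the first cycle, which is why light interactive use feels responsive.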
这样的分时系统不仅可以将传统方式的编程能力提高一两个数量级,而且还可以开辟几种新的计算机使用形式。许多科学和工程应用将逐步重新制定,以便消除目前必须提前指定的包含决策树的程序,而只根据需要指定特定的决策分支。另一个重要领域是教学机器,虽然在计算上通常很琐碎,但可以自然地利用分时系统的控制台,并具有可以使用更复杂和自适应教学程序的额外好处。最后,正如许多小型商用计算机所证明的那样,在商业和工业中有许多应用程序,在这些应用程序中,在孤立的位置拥有强大的计算设施将是有利的,而只需对每个控制台进行增量资本投资。但重要的是要认识到,即使没有上述和其他新的应用程序,分时实现的编程亲密性的重大进步对于以程序调试为主要内容的大学、研究实验室和工程公司的计算机安装也具有直接价值。问题。
Not only would such a time-sharing system improve the ability to program in the conventional manner by one or two orders of magnitude, but there would be opened up several new forms of computer usage. There would be a gradual reformulation of many scientific and engineering applications so that programs containing decision trees which currently must be specified in advance would be eliminated and instead the particular decision branches would be specified only as needed. Another important area is that of teaching machines which, although frequently trivial computationally, could naturally exploit the consoles of a time-sharing system with the additional bonus that more elaborate and adaptive teaching programs could be used. Finally, as attested by the many small business computers, there are numerous applications in business and in industry where it would be advantageous to have powerful computing facilities available at isolated locations with only the incremental capital investment of each console. But it is important to realize that even without the above and other new applications, the major advance in programming intimacy available from time-sharing would be of immediate value to computer installations in universities, research laboratories, and engineering firms where program debugging is a major problem.
如前所述,一个简单的分时计划是在简单的循环中执行小量计算的用户程序,而无需优先级。分时策略可以更复杂,如稍后所示,但上述简单方案是一个足够的解决方案。然而,仍然存在许多问题,一些问题最好通过硬件解决,另一些则影响编程约定和实践。总结了几个比较明显的问题:
As indicated, a straightforward plan for time-sharing is to execute user programs for small quantums of computation without priority in a simple round-robin; the strategy of time-sharing can be more complex as will be shown later, but the above simple scheme is an adequate solution. There are still many problems, however, some best solved by hardware, others affecting the programming conventions and practices. A few of the more obvious problems are summarized:
1. 不同的用户程序如果同时存在于核心内存中,可能会相互干扰或与管理程序发生干扰,因此在操作用户程序时应提供某种形式的内存保护模式。
1. Different user programs if simultaneously in core memory may interfere with each other or the supervisor program so some form of memory protection mode should be available when operating user programs.
2. 分时管理程序可能需要在不同时间从多个位置运行特定程序。(加载重定位位没有帮助,因为管理程序不知道如何重定位累加器等)。拾取指令或数据字的所有存储器访问的动态重定位是一种有效的解决方案。
2. The time-sharing supervisor may need at different times to run a particular program from several locations. (Loading relocation bits are no help since the supervisor does not know how to relocate the accumulator, etc.) Dynamic relocation of all memory accesses that pick up instructions or data words is one effective solution.
3. 输入输出设备可以由用户启动并读取另一个用户程序中的字。[编辑:也就是说,如果没有足够的内存保护机制,用户可能可以通过“读入”输入然后将其存储在另一个程序占用的内存“上”来破坏另一个用户的程序。] 避免的方法这是为了捕获用户程序在内存保护模式下运行时发出的所有输入输出指令。
3. Input-output equipment may be initiated by a user and read words in on another user program. [EDITOR: That is, without an adequate memory protection mechanism, a user might be able to clobber another user’s program by having input “read in” and then stored “on” the memory occupied by the other program.] A way to avoid this is to trap all input-output instructions issued by a user’s program when operated in the memory protection mode.
4. 对于所有用户的通用程序存储文件来说,需要一个大的随机存取备份存储。目前的大容量光盘单元似乎已经足够了。
4. A large random-access back-up storage is desirable for general program storage files for all users. Present large capacity disc units appear to be adequate.
5. 分时管理程序必须能够在完成一定量的计算后中断用户的程序。由程序启动的单稳态多谐振荡器在固定时间后生成中断就足够了。
5. The time-sharing supervisor must be able to interrupt a user’s program after a quantum of computation. A program-initiated one-shot multivibrator which generates an interrupt a fixed time later is adequate.
6.大的核心存储器(例如一百万字)将极大地减轻系统编程的复杂性,因为不同的活动用户程序以及频繁使用的系统程序(例如编译器、查询程序等)可以始终保留在核心存储器中。
6. Large core memories (e.g. a million words) would ease the system programming complications immensely since the different active user programs as well as the frequently used system programs such as compilers, query programs, etc. could remain in core memory at all times.
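Items 1 and 2 of the list above (memory protection and dynamic relocation) can be illustrated together. The sketch below assumes a base-and-limit scheme in which every address a user program issues is checked and then offset; the text does not say the hardware works exactly this way, and the names `translate` and `ProtectionTrap` are hypothetical.

```python
class ProtectionTrap(Exception):
    """Raised when a user program touches memory outside its own block."""

def translate(address, base, limit):
    """Map a program-relative address to a physical one, or trap.

    `base` is where the supervisor happened to load the program and
    `limit` is the size of its block; both are assumptions of this sketch.
    """
    if not 0 <= address < limit:
        raise ProtectionTrap(f"address {address} outside 0..{limit - 1}")
    return base + address

# The same program-relative address works wherever the program is loaded.
first_load = translate(10, base=5000, limit=1000)
second_load = translate(10, base=12000, limit=1000)
```

Because relocation is applied on every access, the supervisor can move a program between runs without rewriting any addresses held in registers, which is the difficulty item 2 raises about relocation bits.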
1. 管理程序必须自动进行用户使用费用统计。一般来说,应根据系统使用公式或算法向用户收费,该公式或算法应包括计算时间、所需高速存储器数量、辅助存储器租金等因素。
1. The supervisor program must do automatic user usage charge accounting. In general, the user should be charged on the basis of a system usage formula or algorithm which should include such factors as computation time, amount of high-speed memory required, rent of secondary memory storage, etc.
2. 管理程序应该协调所有用户输入输出,因为在输入输出受限操作期间要求用户程序始终保留在存储器中是不合需要的。此外,管理员必须协调为所有用户服务的中央共享高速输入输出单元以及时钟、磁盘单元等的所有使用。
2. The supervisor program should coordinate all user input-output since it is not desirable to require a user program to remain constantly in memory during input-output limited operations. In addition, the supervisor must coordinate all usage of the central, shared high-speed input-output units serving all users as well as the clocks, disc units, etc.
3. 可用的系统程序必须足够有效,以便用户可以思考他的问题,而不会受到编码细节或印刷错误的阻碍。因此,编译器、查询程序、事后分析程序、加载器和良好的编辑程序是必不可少的。
3. The system programs available must be potent enough so that the user can think about his problem and not be hampered by coding details or typographical mistakes. Thus, compilers, query programs, post-mortem programs, loaders, and good editing programs are essential.
4. 在语言选择和不受限制的情况下,应尽可能为用户提供最大的编程灵活性。
4. As much as possible, the users should be allowed the maximum programming flexibility both in choices of language and in the absence of restrictions.
1. 可能会无意中请求太大的计算量或过多的打字机输出,因此应该向用户提供特殊的终止信号。
1. Too large a computation or excessive typewriter output may be inadvertently requested so that a special termination signal should be available to the user.
2. 由于实时不是计算机使用时间,因此主管必须让每个用户了解情况,以便他能够对循环等进行判断。
2. Since real-time is not computer usage-time, the supervisor must keep each user informed so that he can use his judgment regarding loops, etc.
3. 计算机处理器、内存和磁带可能会出现故障。基本操作问题,例如“哪个程序正在运行?” 必须是负责任的并且完全预期恢复程序。
3. Computer processor, memory and tape malfunctions must be expected. Basic operational questions such as “Which program is running?” must be answerable and recovery procedures fully anticipated.
简要说明了理想的分时性能后,有必要问一下现有设备可以达到什么水平的性能。为了开始回答这个问题并探索所有编程和操作方面,一个实验性的分时系统已经已开发。该系统最初是为 IBM 709 编写的,但后来被转换为与 7090 计算机一起使用。
Having briefly stated a desirable time-sharing performance, it is pertinent to ask what level of performance can be achieved with existent equipment. To begin to answer this question and to explore all the programming and operational aspects, an experimental time-sharing system has been developed. This system was originally written for the IBM 709 but has since been converted for use with the 7090 computer.
麻省理工学院计算中心的 7090 除了具有 19 个磁带单元的三个通道外,还有一个具有标准直接数据连接的第四个通道。与直接数据连接相连的是一个实时设备缓冲区和控制机架,是在 H. Teager 及其团队的指导下设计和建造的。[Teager (1962) 目前正在使用另一种方法为 MIT7090 开发分时系统。] 该机架连接有各种设备,但当前系统所需的唯一设备是三台柔版打字机。7090 上还安装了两个特殊修改(即 RPQ):标准 60 周期计数和中断时钟,以及允许内存保护、动态重定位和捕获所有用户启动输入输出指令尝试的特殊模式。
The 7090 of the MIT Computation Center has, in addition to three channels with 19 tape units, a fourth channel with the standard Direct Data Connection. Attached to the Direct Data Connection is a real-time equipment buffer and control rack designed and built under the direction of H. Teager and his group. [Teager (1962) is presently using another approach in developing a time-sharing system for the MIT 7090.] This rack has a variety of devices attached but the only ones required by the present systems are three flexowriter typewriters. Also installed on the 7090 are two special modifications (i.e. RPQ’s): a standard 60 cycle accounting and interrupt clock, and a special mode which allows memory protection, dynamic relocation and trapping of all user attempts to initiate input-output instructions.
在本系统中,分时发生在四个用户之间,其中三个用户在前台系统中的打字机前在线,第四个被动用户是后台 Fap-Mad-Madtran-BSS 监控系统,类似于 F ORTRAN -Fap-BSS 监控系统 (FMS) 被大多数中心程序员和许多其他 7090 安装使用。
In the present system the time-sharing occurs between four users, three of whom are on-line each at a typewriter in a foreground system, and a fourth passive user of the background Fap-Mad-Madtran-BSS Monitor system similar to the FORTRAN-Fap-BSS Monitor System (FMS) used by most of the Center programmers and by many other 7090 installations.
前台系统的重要设计特点是:
Significant design features of the foreground system are:
1.允许用户用与后台系统兼容的语言开发程序,
1. It allows the user to develop programs in languages compatible with the background system,
2. 开发程序的私有文件,
2. Develop a private file of programs,
3. 在前一个会话的状态下启动调试会话,并且
3. Start debugging sessions at the state of the previous session, and
4. 设定自己的节奏,很少浪费电脑时间。
4. Set his own pace with little waste of computer time.
核心存储的分配使得所有用户都在较高的 27,000 个字中进行操作,而分时管理器 (TSS) 则永久在较低的 5,000 个字中进行操作。为了避免内存分配冲突,保护用户免受彼此的影响,并简化最初的 709 系统组织,核心内存中一次只保留一个用户。然而,借助 7090 的特殊内存保护和重定位功能,可以实现更复杂的存储分配过程。在任何情况下,通过使用 2 通道重叠磁带读写两个用户程序中的相关位置,可以最大限度地减少用户交换。
Core storage is allocated such that all users operate in the upper 27,000 words with the time-sharing supervisor (TSS) permanently in the lower 5,000 words. To avoid memory allocation clashes, protect users from one another, and simplify the initial 709 system organization, only one user was kept in core memory at a time. However, with the special memory protection and relocation feature of the 7090, more sophisticated storage allocation procedures are being implemented. In any case, user swaps are minimized by using 2-channel overlapped magnetic tape reading and writing of the pertinent locations in the two user programs.
前台系统围绕每个用户可以在其打字机上发出的命令和用户的私人程序文件进行组织,这些文件目前(由于需要磁盘单元)保存在每个用户的单独磁带上。
The foreground system is organized around commands that each user can give on his typewriter and the user’s private program files which presently (for want of a disc unit) are kept on a separate magnetic tape for each user.
为了方便起见,专用磁带文件的格式是卡片图像,具有带有名称和类别指示符的标题卡,并且可以使用离线设备写入或打孔。(后一个功能还提供了大规模输入输出的粗略形式。)系统的磁带需求是后台系统正常功能所需的七个磁带,一个用于分时监控器的系统磁带,其中包含大多数命令程序,以及三个前台用户各一个专用文件磁带和转储磁带。
For convenience the format of the private tape files is such that they are card images, have title cards with name and class designators and can be written or punched using the off-line equipment. (The latter feature also offers a crude form of large-scale input-output.) The magnetic tape requirements of the system are the seven tapes required for the normal functions of the background system, a system tape for the time-sharing supervisor that contains most of the command programs, and a private file tape and dump tape for each of the three foreground users.
命令由用户输入到分时管理器(而不是他自己的程序),因此可以随时启动,而不管内存中的特定用户程序如何。出于类似的协调原因,主管处理前台系统的所有输入输出打字机。命令由由垂直笔画分隔的段组成;第一段是命令名称,其余段是与命令相关的参数。每个段由最后输入的 6 个字符组成(以隐含的 6 个空格开头),因此间距是纠正输入错误的简单方法。回车是启动命令操作的信号。每当主管收到命令时,就会输入“WAIT”,然后输入“READY”。当命令完成时。(计算机响应的颜色始终与用户键入的颜色相反。)键入时,不完整的命令行可能会被代码删除信号的“退出”序列和回车符忽略。类似地,在启动命令后,如果给出“退出”序列,则可能会放弃该命令。此外,在不需要的命令输入期间,可以通过按下特殊的“停止输出”按钮来终止命令和输出。
The commands are typed by the user to the time-sharing supervisor (not to his own program) and thus can be initiated at any time regardless of the particular user program in memory. For similar coordination reasons, the supervisor handles all input-output of the foreground system typewriters. Commands are composed of segments separated by vertical strokes; the first segment is the command name and the remaining segments are parameters pertinent to the command. Each segment consists of the last 6 characters typed (starting with an implicit 6 blanks) so that spacing is an easy way to correct a typing mistake. A carriage return is the signal which initiates action on the command. Whenever a command is received by the supervisor, “WAIT” is typed back followed by “READY.” when the command is completed. (The computer responses are always in the opposite color from the user’s typing.) While typing, an incomplete command line may be ignored by the “quit” sequence of a code delete signal followed by a carriage return. Similarly after a command is initiated, it may be abandoned if a “quit” sequence is given. In addition, during unwanted command typeouts, the command and output may be terminated by pushing a special “stop output” button.
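The segment rule described above (vertical strokes as separators, each segment being the last 6 characters typed after an implicit 6 blanks) can be sketched as follows. The function name `parse_command` and the sample command `load` are hypothetical, and the sketch assumes the carriage return has already been stripped from the line.

```python
def parse_command(line):
    """Split a typed command line into (name, parameters).

    Each segment keeps only its last 6 characters, after being
    preceded by an implicit 6 blanks.
    """
    segments = []
    for raw in line.split("|"):
        padded = " " * 6 + raw
        segments.append(padded[-6:])
    return segments[0], segments[1:]

# A mistyped "lod" is corrected by spacing and retyping: only the
# last 6 characters of the segment survive.
name, params = parse_command("lod  load|myprog")
```

This shows why spacing corrects a typing mistake: the earlier characters simply fall outside the 6-character window.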
每当打字机用户完成命令行并被放置在等待命令队列中时,就会启动前台系统的使用。每个时间片完成后,分时管理器会优先启动任何等待的命令。大部分命令对应的系统程序都保存在专用的管理命令系统磁带上,这样,为了避免浪费计算机时间,管理人员继续操作最后一个用户程序,直到磁带上所需的命令程序被定位到可供读取为止。此时,最后一个用户在其转储磁带上被读出,命令程序读入,置于工作状态并作为新用户程序启动。然而,在开始新用户进行一定量的计算之前,主管再次检查另一个用户的任何等待命令,并且如果必要的话,在操作新用户的同时开始命令系统磁带的前瞻定位。
The use of the foreground system is initiated whenever a typewriter user completes a command line and is placed in a waiting command queue. Upon completion of each quantum, the time-sharing supervisor gives top priority to initiating any waiting commands. The system programs corresponding to most of the commands are kept on the special supervisor command system tape so that to avoid waste of computer time, the supervisor continues to operate the last user program until the desired command program on tape is positioned for reading. At this point, the last user is read out on his dump tape, the command program read in, placed in a working status and initiated as a new user program. However, before starting the new user for a quantum of computation, the supervisor again checks for any waiting command of another user and if necessary begins the look-ahead positioning of the command system tape while operating the new user.
每当等待命令队列为空时,管理程序就会继续执行工作状态队列中那些前台用户程序的简单循环。最后,如果这两个队列都是空的,则后台用户程序被引入并一次运行一个量程,直到进一步的前台系统积极开发为止。
Whenever the waiting command queue is empty, the supervisor proceeds to execute a simple round-robin of those foreground user programs in the working status queue. Finally, if both these queues are empty, the background user program is brought in and run a quantum at a time until further foreground system activity develops.
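The priority order just described (waiting commands first, then the working-status round-robin, then the background job) can be sketched as a single selection function. This is a simplification under stated assumptions: the name `next_to_run` and the user labels are hypothetical, and the tape look-ahead positioning described earlier is left out.

```python
from collections import deque

def next_to_run(waiting_commands, working, background="background"):
    """Pick what the supervisor runs for the next quantum."""
    if waiting_commands:                 # top priority: start a waiting command
        return ("command", waiting_commands.popleft())
    if working:                          # otherwise: simple round-robin
        user = working.popleft()
        working.append(user)
        return ("foreground", user)
    return ("background", background)    # both queues empty

waiting = deque(["user1"])
working = deque(["user2", "user3"])
order = [next_to_run(waiting, working) for _ in range(4)]
idle = next_to_run(deque(), deque())
```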
前台用户程序通过两种方式离开工作状态队列。如果程序继续完成,它可以以自我消除的方式重新进入监督者并使用户处于死亡状态;或者,通过不同的条目,可以将程序置于休眠状态(或由用户执行退出序列手动置于)。休眠状态与死亡状态的不同之处在于用户仍然可以重新启动或检查他的程序。
Foreground user programs leave the working status queue by two means. If the program proceeds to completion, it can reenter the supervisor in a way which eliminates itself and places the user in dead status; alternatively, by a different entry the program can be placed in a dormant status (or be manually placed by the user executing a quit sequence). The dormant status differs from the dead status in that the user may still restart or examine his program.
用户输入输出是通过每台打字机进行的,即使管理员有几行可用的缓冲区空间,也可能会受到输入输出的限制。因此,存在一个额外的输入输出等待状态,类似于休眠状态,只要出现输入输出延迟,管理程序就会自动将用户置于该状态。当输出时缓冲区接近空或输入时缓冲区接近满时,用户程序自动返回到工作状态;从而避免浪费计算机时间。……
User input-output is through each typewriter, and even though the supervisor has a few lines of buffer space available, it is possible to become input-output limited. Consequently, there is an additional input-output wait status, similar to the dormant, which the user is automatically placed in by the supervisor program whenever input-output delays develop. When buffers become near empty on output or near full on input, the user program is automatically returned to the working status; thus waste of computer time is avoided. …
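The four statuses mentioned so far (working, input-output wait, dormant, dead) form a small state machine. The sketch below is one reading of the text; the event names are hypothetical labels chosen for the example, and the "restart" transition assumes that restarting a dormant program returns it to working status.

```python
from enum import Enum

class Status(Enum):
    WORKING = "working"
    IO_WAIT = "input-output wait"
    DORMANT = "dormant"
    DEAD = "dead"

# (current status, event) -> next status
TRANSITIONS = {
    (Status.WORKING, "complete"): Status.DEAD,        # program runs to completion
    (Status.WORKING, "pause"): Status.DORMANT,        # program's own dormant entry
    (Status.WORKING, "quit"): Status.DORMANT,         # user's quit sequence
    (Status.WORKING, "io_delay"): Status.IO_WAIT,     # buffers cannot keep up
    (Status.IO_WAIT, "buffer_ready"): Status.WORKING, # buffers drained or filled
    (Status.DORMANT, "restart"): Status.WORKING,      # assumed: user restarts program
}

def step(status, event):
    """Return the next status, leaving it unchanged for events that do not apply."""
    return TRANSITIONS.get((status, event), status)
```

A dead program stays dead under every event, which matches the distinction drawn above between dead and dormant status.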
尽管迄今为止使用该系统的经验相当有限,但初步迹象表明,如果这种系统普遍可用,程序员将很容易使用它。既然已经有了系统的一些操作经验,那么询问可以进行哪些观察是有用的。[注:操作初步在 709 计算机上使用该系统获得了经验;由于设备转换困难,到 5 月 3 日为止,还无法在逻辑上等效的 7090 计算机上使用该系统。] 立即的评论是,一旦用户习惯了计算机响应,即使是零点一分钟的延迟也非常长,类似于与语速慢的人交谈的效果。同样,要求完整的打字行而不是每个字符作为人机通信的最小单位是一个抑制因素,因为按讲无线电话对话比普通电话更加生硬。由于在逐个字符的基础上保持计算机的快速响应需要在核心存储器中始终至少有一个残余响应程序,因此本系统内的直接解决方案是具有更多可用的核心存储器。至少,为分时监控器提供额外的内存可以缓解已经为 32,000 字 7090 编写的程序的兼容性问题。
Although experience with the system to date is quite limited, first indications are that programmers would readily use such a system if it were generally available. It is useful to ask, now that there is some operating experience with the system, what observations can be made. [Note: Operating experience was initially gained using the system on the 709 computer; due to equipment conversion difficulties, it was not possible to use the system on the logically equivalent 7090 computer by May 3.] An immediate comment is that once a user gets accustomed to computer response, delays of even a fraction of a minute are exasperatingly long, an effect analogous to conversing with a slow-speaking person. Similarly, the requirement that a complete typewritten line rather than each character be the minimum unit of man–computer communication is an inhibiting factor in the sense that a press-to-talk radio-telephone conversation is more stilted than that of an ordinary telephone. Since maintaining a rapid computer response on a character by character basis requires at least a vestigial response program in core memory at all times, the straightforward solution within the present system is to have more core memory available. At the very least, an extra bank of memory for the time-sharing supervisor would ease compatibility problems with programs already written for 32,000 word 7090’s.
为了方便起见,本系统最薄弱的部分是输入约定、用户文件编辑以及调试时可能的快速交互和亲密程度。由于这些领域在很大程度上涉及用户的品味、习惯和心理,因此认为正确的解决方案需要大量的实验和务实的评估;同样明显的是,不能抽象地对待这些领域,因为所使用的编程语言将极大地影响适当的技术。当然需要更多地使用位置、程序名称和变量的符号引用;符号事后分析程序、跟踪程序和前后差异转储程序应该在调试过程中发挥有用的作用。
For reasons of expediency, the weakest portions of the present system are the conventions for input, editing of user files, and the degree of rapid interaction and intimacy possible while debugging. Since to a large extent these areas involve the taste, habits, and psychology of the users, it is felt that proper solutions will require considerable experimentation and pragmatic evaluation; it is also clear that these areas cannot be treated in the abstract for the programming languages used will influence greatly the appropriate techniques. A greater use of symbolic referencing for locations, program names and variables is certainly desired; symbolic post-mortem programs, trace programs, and before-and-after differential dump programs should play useful roles in the debugging procedures.
在本系统的设计中,非常小心地使每个用户独立于其他用户。然而,如果情况并非总是如此,那么这将是系统的有用扩展。特别是,当在计算机控制的组中使用多个控制台时,例如在管理或战争游戏、群体行为研究中或可能在教学机器中,希望所有控制台都与单个程序进行通信。
In the design of the present system, great care went into making each user independent of the other users. However, it would be a useful extension of the system if this were not always the case. In particular, when several consoles are used in a computer controlled group such as in management or war games, in group behavior studies, or possibly in teaching machines, it would be desirable to have all the consoles communicating with a single program.
本系统内需要进一步改进的另一个领域是文件维护,因为目前使用的磁带单元阻碍了用户程序文件的容易删除。光盘单元将在该领域以及解决由许多控制台用户产生的大规模中央输入输出的合并和调度问题方面提供帮助。
Another area for further improvement within the present system is that of file maintenance, since the presently used tape units are a hindrance to the easy deletion of user program files. Disc units will be of help in this area as well as with the problem of consolidating and scheduling large-scale central input-output generated by the many console users.
最后,人们认为消除前台系统和后台系统之间的区别是可取的。当今的计算机操作员将扮演后台用户的替身角色,使用与系统中其他用户控制台非常相似的操作员控制台,按照主管的要求安装和拆卸磁带,接收读卡指令类似地,当前台用户对他的程序感到满意时,他将通过他的控制台和管理程序将他的程序输入到要执行的生产后台工作的队列中。通过实施这些程序,是否分时的区别将消失,并且计算机用户可以以可互换的方式自由选择他认为在特定时间更适合的操作模式。
Finally, it is felt that it would be desirable to have the distinction between the foreground and background systems eliminated. The present-day computer operator would assume the role of a stand-in for the background users, using an operator console much like the other user consoles in the system, mounting and demounting magnetic tapes as requested by the supervisor, receiving instructions to read card decks into the central disc unit, etc. Similarly the foreground user, when satisfied with his program, would by means of his console and the supervisor program enter his program into the queue of production background work to be performed. With these procedures implemented the distinction of whether one is time-sharing or not would vanish and the computer user would be free to choose in an interchangeable way that mode of operation which he found more suitable at a particular time.
无论是100万字的核心内存还是7090目前的32000字内存,都不可避免地面临系统饱和的问题,即活动用户程序的总大小超过高速内存的大小,或者出现系统饱和的问题。活动的用户程序太多,无法在每个用户控制台上保持足够的响应。如果某些用户程序的大小或时间要求过大,即使只有少数用户,也很容易出现这些情况。如果假设系统的良好设计是具有饱和过程,该饱和过程可以适度降低大型且长时间运行的用户的响应时间和有效的实时计算速度,则可以缓解困境。
Regardless of whether one has a million word core memory or a 32,000 word memory as currently exists on the 7090, one is inevitably faced with the problem of system saturation where the total size of active user programs exceeds that of the high-speed memory or there are too many active user programs to maintain an adequate response at each user console. These conditions can easily arise with even a few users if some of the user programs are excessive in size or in time requirements. The predicament can be alleviated if it is assumed that a good design for the system is to have a saturation procedure which gives graceful degradation of the response time and effective real-time computation speed of the large and long-running users.
为了显示一般问题,图 23.1定性地给出了用户服务作为n(活跃用户数量)的函数。该服务参数可能是两个关键因素之一:计算机响应时间或实时计算速度的n倍。无论哪种情况,都存在一些临界数量的活跃用户N,代表有效用户容量,这会导致饱和。如果接近饱和的策略是执行所有用户的简单循环,那么由于突然出现在辅助存储器中交换程序进出所需的大量时间,因此服务会突然崩溃,例如作为光盘或鼓单元。当然,图 23.1是相当定性的,因为它主要取决于用户程序大小的范围以及用户操作时间的范围。
To show the general problem, Figure 23.1 qualitatively gives the user service as a function of n, the number of active users. This service parameter might be either of the two key factors: computer response time or n times the real-time computation speed. In either case there is some critical number of active users, N, representing the effective user capacity, which causes saturation. If the strategy near saturation is to execute the simple round-robin of all users, then there is an abrupt collapse of service due to the sudden onset of the large amount of time required to swap programs in-and-out of the secondary memory such as a disc or drum unit. Of course, Figure 23.1 is quite qualitative since it depends critically on the spectrum of user program sizes as well as the spectrum of user operating times.
图 23.1: 服务与活跃用户数量
Figure 23.1: Service vs. number of active users
为了说明可用于提高分时系统饱和性能的策略,提出了一种多级调度算法。还可以分析该算法以给出系统性能的广泛范围。多级调度算法的基础是将每个进入系统运行(或完成对用户的响应)的用户程序分配到第ℓ级优先级队列。程序最初进入级别ℓ 0,对应于它们的大小,使得
To illustrate the strategy that can be employed to improve the saturation performance of a time-sharing system, a multi-level scheduling algorithm is presented. This algorithm also can be analyzed to give broad bounds on the system performance. The basis of the multi-level scheduling algorithm is to assign each user program as it enters the system to be run (or completes a response to a user) to an ℓth level priority queue. Programs are initially entered into a level ℓ_0, corresponding to their size such that
式中,w p为程序中的字数,w q为在一个量程q的时间内可从辅助存储器进出高速存储器的字数,括号中表示“的组成部分。” 通常,作为基本时间单位的量子时间应该尽可能小,而不会造成过多的开销损失。主管从高速存储器中的一个程序切换到另一个程序。该过程从分时管理程序开始,在最低级别占用队列ℓ的头部运行程序,持续最多 2 ℓ量子时间,然后如果程序未完成(即未对用户做出响应) )将其放在ℓ +1级队列的末尾。如果低于ℓ的级别没有程序进入系统,则继续此过程,直到级别ℓ的队列耗尽为止;然后,该过程在级别ℓ + 1处再次开始迭代,现在每个程序运行 2 ℓ +1量子时间。如果在执行级别ℓ的程序的2 ℓ量子期间,较低级别ℓ ' 被占用,则当前用户将被替换为第ℓ队列的头部,并且进程在级别ℓ ' 重新启动。
where w_p is the number of words in the program, w_q is the number of words which can be transmitted in and out of the high-speed memory from the secondary memory in the time of one quantum, q, and the bracket indicates “the integral part of.” Ordinarily the time of a quantum, being the basic time unit, should be as small as possible without excessive overhead losses when the supervisor switches from one program in high-speed memory to another. The process starts with the time-sharing supervisor operating the program at the head of the lowest level occupied queue, ℓ, for up to 2^ℓ quanta of time and then if the program is not completed (i.e. has not made a response to the user) placing it at the end of the ℓ + 1 level queue. If there are no programs entering the system at levels lower than ℓ, this process proceeds until the queue at level ℓ is exhausted; the process is then iteratively begun again at level ℓ + 1, where now each program is run for 2^(ℓ+1) quanta of time. If during the execution of the 2^ℓ quanta of a program at level ℓ, a lower level, ℓ′, becomes occupied, the current user is replaced at the head of the ℓth queue and the process is reinitiated at level ℓ′.
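The displayed relation for the initial level is not reproduced in the text above. A form consistent with the definitions just given (two "integral part" brackets, program size w_p, and w_q words transferable per quantum) is offered here as a reconstruction, not as a quotation:

```latex
% Reconstruction under the stated definitions; [x] denotes the integral part of x.
\ell_0 = \left[ \log_2 \left( \left[ \frac{w_p}{w_q} \right] + 1 \right) \right]
```

On this reading a program entering at level ℓ_0 runs for 2^{ℓ_0} quanta, which is of the same order as the roughly w_p/w_q quanta needed to swap it, so larger programs start at higher levels and are swapped less often.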
类似地,如果级别ℓ的大小为w p的程序在运行期间请求分时管理程序更改内存大小,则该程序的放大(或缩小)版本应放置在 ℓ ′′的末尾排队在哪里
Similarly, if a program of size w_p at level ℓ, during operation requests a change in memory size from the time-sharing supervisor, then the enlarged (or reduced) version of the program should be placed at the end of the ℓ′′ queue where
再次,队列头用户在最低占用级别ℓ ' 处重新启动该过程。
Again the process is re-initiated with the head-of-the-queue user at the lowest occupied level of ℓ′.
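The core of the multi-level procedure can be sketched as a short simulation. This is a simplified illustration under stated assumptions: all programs are present at the start (so the preemption by a newly occupied lower level never arises), swap time and memory-size changes are ignored, and the names `multilevel_schedule` and `max_level` are hypothetical.

```python
from collections import deque

def multilevel_schedule(jobs, max_level=16):
    """Simulate the multi-level queues; return finish times in quanta.

    `jobs` is a list of (name, initial_level, quanta_needed).
    """
    queues = [deque() for _ in range(max_level + 1)]
    for name, level, need in jobs:
        queues[level].append((name, need))
    clock = 0
    finished = {}
    while True:
        # Always serve the lowest occupied level.
        level = next((l for l, q in enumerate(queues) if q), None)
        if level is None:
            return finished
        name, need = queues[level].popleft()
        burst = min(2 ** level, need)   # run for up to 2^level quanta
        clock += burst
        need -= burst
        if need == 0:
            finished[name] = clock
        else:
            # Not done: demote to the end of the next level's queue.
            queues[min(level + 1, max_level)].append((name, need))

result = multilevel_schedule([("short", 0, 1), ("long", 0, 10)])
```

The short request finishes after 1 quantum, while the long one cascades through levels 0 to 3 in bursts of 1, 2, 4 and 3 quanta, illustrating how long runs are handled in progressively larger pieces.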
从上述算法可以得出几个重要的结论,这些结论允许系统的性能受到限制。
Several important conclusions can be drawn from the above algorithm which allow the performance of the system to be bounded.
1.计算效率。由于程序运行的时间总是大于或等于交换时间(即,将程序移入和移出辅助存储器所需的时间),因此计算效率永远不会低于二分之一。(显然,这个分数在初始水平ℓ 0的公式中是可以调整的。)查看这个界限的另一种方法是说, n 个活跃用户中的一个可用的实时计算速度并不比有有 2 n 个活跃用户,他们的所有程序都在高速存储器中。
1. Computational efficiency. Because a program is always operated for a time greater than or equal to the swap time (i.e., the time required to move the program in and out of secondary memory), it follows that the computational efficiency never falls below one-half. (Clearly, this fraction is adjustable in the formula for the initial level, ℓ_0.) An alternative way of viewing this bound is to say that the real-time computing speed available to one out of n active users is no worse than if there were 2n active users all of whose programs were in the high-speed memory.
2.响应时间。如果最大活跃用户数为N,则给定程序大小的单个用户可以保证响应时间,
2. Response time. If the maximum number of active users is N, then an individual user of a given program size can be guaranteed a response time,
因为最坏的情况发生在所有竞争的用户程序处于同一级别时。相反,如果t r是任意值的保证响应并且假定程序的最大尺寸,则最大允许的活动用户数是有界的。
since the worst case occurs when all competing user programs are at the same level. Conversely, if t_r is a guaranteed response of arbitrary value and the largest size of program is assumed, then the maximum permissible number of active users is bounded.
3.长跑。长时间运行时的相对交换时间可以变得非常小。得出这个结论是因为程序运行的时间越长,它级联到的级别数就越高,相对交换时间也相应更小。该算法的一个重要特征是,长时间运行必须有效地证明它们很长,以便快速检测到意外终止的程序。为了使层数有限,可以建立最大层数L,使得渐近交换开销为某个任意小的百分比,p:
3. Long runs. The relative swap time on long runs can be made vanishingly small. This conclusion follows since the longer a program is run, the higher the level number it cascades to with a correspondingly smaller relative swap time. It is an important feature of the algorithm that long runs must in effect prove they are long so that programs which have an unexpected demise are detected quickly. In order that there be a finite number of levels, a maximum level number, L, can be established such that the asymptotic swap overhead is some arbitrarily small percentage, p:
其中w pmax是最大可能程序的大小。
where w_pmax is the size of the largest possible program.
4.多级响应时间与单级响应时间。相同大小的程序、同时进入系统并运行多个量子的响应时间不比单量子循环过程中发生的响应时间差大约两倍。如果在第ℓ层的队列中启动了n 个大小相等的程序,那么最坏的情况是队列末端程序准备在第ℓ + j层的第一个量子运行时做出响应。使用多级算法,队列末尾程序的总延迟取决于量子的几何级数:
4. Multi-level vs. single-level response times. The response time for programs of equal size, entering the system at the same time, and being run for multiple quanta, is no worse than approximately twice the response-time occurring in a single quanta round-robin procedure. If there are n equal sized programs started in a queue at level ℓ, then the worst case is that of the end-of-the-queue program which is ready to respond at the very first quantum run at the ℓ + j level. Using the multi-level algorithm, the total delay for the end-of-the-queue program is by virtue of the geometric series of quanta:
由于队列末端用户计算了 2 ℓ (2 j − 1) 量子的时间,因此响应之前的等效单级循环延迟为:
Since the end-of-the-queue user has computed for a time of 2^ℓ(2^j − 1) quanta, the equivalent single-level round-robin delay before a response is:
因此
Hence
并显示断言。应当注意,当所有程序保留在高速存储器中时,省略程序交换时间的上述条件对于多级算法来说是最不利的;如果上述分析中包含交换次数,则T m /T s的比率只能变得更小,并且可能变得远小于1。通过类似的分析,很容易表明,即使在没有程序交换的不利情况下,在量子完成时终止的队列头程序在多级算法下也会收到两倍的响应与单级循环赛(即 )下的速度一样快。
and the assertion is shown. It should be noted that the above conditions, where program swap times are omitted, which are pertinent when all programs remain in high-speed memory, are the least favorable for the multi-level algorithm; if swap times are included in the above analysis, the ratio of T_m/T_s can only become smaller and may become much less than unity. By a similar analysis it is easy to show that even in the unfavorable case where there are no program swaps, head-of-the-queue programs that terminate just as the quanta are completed receive under the multi-level algorithm a response which is twice as fast as that under the single-level round-robin (i.e. T_m/T_s = 1/2).
5.最高的服务水平。在多级算法中,程序的级别分类过程是完全自动的,取决于性能和程序大小而不是每个用户的声明(或希望)。当用户对系统征税时,从大型或长时间运行的程序的较高级别用户开始,服务逐渐退化;然而,在某些级别,由于较低级别的活动用户过多,可能无法运行任何用户程序。为了确定这个截止点的界限,我们考虑N 个处于级别ℓ的活跃用户,每个用户运行 2 ℓ量子,终止,并稍后在用户响应时间t u时在级别ℓ再次重新进入系统。如果级别ℓ + 1处没有服务,则计算时间Nq 2 ℓ必须大于或等于tu。因此,给出了保证的活性水平由关系式:
5. Highest serviced level. In the multi-level algorithm the level classification procedure for programs is entirely automatic, depending on performance and program size rather than on the declarations (or hopes) of each user. As a user taxes the system, the degradation of service occurs progressively starting with the higher level users of either large or long-running programs; however, at some level no user programs may be run because of too many active users at lower levels. To determine a bound on this cut-off point we consider N active users at level ℓ each running 2^ℓ quanta, terminating, and reentering the system again at level ℓ at a user response time, t_u, later. If there is to be no service at level ℓ + 1, then the computing time, N q 2^ℓ, must be greater than or equal to t_u. Thus the guaranteed active levels, ℓ_a, are given by the relation:
在极限情况下,t u可以小到最小用户反应时间(∼.2秒),但由于大量用户的统计,预期值会大几个数量级。
In the limit, t_u could be as small as a minimum user reaction time (∼.2 sec.), but the expected value would be several orders of magnitude greater as a result of the statistics of a large number of users.
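The displayed relations for conclusions 2 to 5 are not reproduced in the text above. The following forms can be derived from the worst cases each item describes; they are offered as a reconstruction under those stated assumptions and may differ in detail from the expressions originally printed.

```latex
% Item 2: N competing programs at level \ell_0, each run for 2^{\ell_0} quanta,
% with swap time at most equal to run time (efficiency at least one-half).
t_r \le 2 N q \, 2^{\ell_0}

% Item 3: at level L a program runs 2^{L} quanta while swapping takes about
% w_{p\max}/w_q quanta; requiring that ratio to be at most p gives
\frac{w_{p\max}/w_q}{2^{L}} \le p
\quad\Longrightarrow\quad
L \ge \log_2 \frac{w_{p\max}}{p \, w_q}

% Item 4: n programs served through levels \ell, \dots, \ell+j-1 (a geometric
% series), then up to n bursts of 2^{\ell+j} quanta before the last one responds.
T_m \approx n q \, 2^{\ell} \left( 2^{j+1} - 1 \right), \qquad
T_s = n q \, 2^{\ell} \left( 2^{j} - 1 \right), \qquad
\frac{T_m}{T_s} \approx \frac{2^{j+1} - 1}{2^{j} - 1} = 2 + \frac{1}{2^{j} - 1}

% Item 5: a conservative reading of "N q 2^{\ell} \ge t_u implies no service
% at level \ell + 1"; [x] denotes the integral part of x.
\ell_a \le \left[ \log_2 \frac{t_u}{N q} \right]
```

The ratio in item 4 approaches 2 as j grows, which is the "approximately twice" of the text.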
当盘或鼓单元用作辅助存储器时,上面制定的多级算法没有明确考虑在将程序传输到盘或鼓单元或从盘或鼓单元传输程序之前所需的寻道或等待时间(尽管形式上,因子w q 可以包含平均值)。这些时间的数字)。对算法的一种简单修改通常可以避免浪费寻道或等待时间,即继续操作最后一个用户程序所需的量子数,以准备好新用户与最低优先级用户的交换;由于通常只有较高级别的程序才会被强制输出到辅助存储器中,因此旧用户在寻找新用户时的扩展操作量应该只是对基本算法的轻微扭曲。
The multi-level algorithm as formulated above makes no explicit consideration of the seek or latency time required before transmission of programs to and from disc or drum units when they are used as the secondary memory, (although formally the factor wq could contain an average figure for these times). One simple modification to the algorithm which usually avoids wasting the seek or latency time is to continue to operate the last user program for as many quanta as are required to ready the swap of the new user with the least priority user; since ordinarily only the higher level number programs would be forced out into the secondary memory, the extended quanta of operation of the old user while seeking the new user should be but a minor distortion of the basic algorithm.
当硬件合适时,进一步的复杂性是可能的。在具有输入输出通道和与辅助存储器之间的传输速率较低的计算机中,在操作当前用户的同时,新老用户在高速存储器中的读写可能会重叠。其效果相当于使用鼓提供 100% 多路复用器使用率,但有两个缺点,即,没有单个用户可以利用所有可用的用户存储空间,并且只要发生意外的调度更改(例如,程序),前瞻过程就会崩溃。终止或启动更高优先级的用户程序)。
Further complexities are possible when the hardware is appropriate. In computers with input-output channels and low transmission rates to and from secondary memory, it is possible to overlap the reading and writing of the new and old users in and out of high-speed memory while operating the current user. The effect is equivalent to using a drum giving 100% multiplexor usage but there are two liabilities, namely, no individual user can utilize all the available user memory space and the look-ahead procedure breaks down whenever an unanticipated scheduling change occurs (e.g. a program terminates or a higher-priority user program is initiated).
存储分配也可能很复杂,但对于低传输速率的辅助存储器,一个基本且可取的做法是:每当有足够的零碎未使用存储空间可用于读入新的用户程序时,将所有高优先级用户程序合并到单个块中。这样的过程在多级调度算法的流程图中示出,如图23.2所示。
Complexity is also possible in storage allocation but certainly an elementary procedure and a desirable one with a low-transmission rate secondary memory is to consolidate in a single block all high-priority user programs whenever sufficient fragmentary unused memory space is available to read in a new user program. Such a procedure is indicated in the flow diagram of the multi-level scheduling algorithm which is given as Figure 23.2.
图23.2: 多级调度算法流程图
Figure 23.2: Flow chart of the multi-level scheduling algorithm
还需要注意的是,图23.2仅考虑了处于工作状态的程序的调度,仍然没有考虑处于休眠(或输入输出等待状态)的程序的存储分配。处理这种情况的一种系统方法是修改调度算法,使得在级别ℓ休眠的程序进入级别ℓ + 1 的队列。调度算法像以前一样进行,休眠程序继续级联,但在以下情况下不运行:他们排在队伍的最前面。每当必须从高速存储器中删除程序时,就从最高占用层数的队列末尾选择程序。
It should also be noted that Figure 23.2 only accounts for the scheduling of programs in a working status and still does not take into account the storage allocation of programs which are in a dormant (or input-output wait) status. One systematic method of handling this case is to modify the scheduling algorithm so that programs which become dormant at level ℓ are entered into the queue at level ℓ + 1. The scheduling algorithm proceeds as before with the dormant programs continuing to cascade but not operating when they reach the head of a queue. Whenever a program must be removed from high-speed memory, a program is selected from the end of the queue of the highest occupied level number.
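The dormant-program rule can be sketched in a few lines. This is an illustrative model only: the class and field names are hypothetical, and the allotment of 2**level quanta per turn is an assumption standing in for the algorithm "formulated above", which is not reproduced in this section.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Program:
    name: str
    quanta_needed: int      # remaining work in quanta (hypothetical field)
    dormant: bool = False   # True while in input-output wait status


class MultiLevelScheduler:
    def __init__(self, levels: int):
        self.queues = [deque() for _ in range(levels)]

    def enter(self, prog: Program, level: int) -> None:
        # Programs that would pass the last level stay in the last level.
        self.queues[min(level, len(self.queues) - 1)].append(prog)

    def step(self):
        """Serve the head of the lowest occupied level.

        Returns (name, quanta actually run), or None if every queue is empty.
        """
        for level, queue in enumerate(self.queues):
            if not queue:
                continue
            prog = queue.popleft()
            if prog.dormant:
                # Dormant programs keep cascading but are not operated.
                self.enter(prog, level + 1)
                return (prog.name, 0)
            run = min(2 ** level, prog.quanta_needed)  # assumed allotment
            prog.quanta_needed -= run
            if prog.quanta_needed > 0:
                self.enter(prog, level + 1)
            return (prog.name, run)
        return None

    def eviction_candidate(self):
        """Program at the end of the queue of the highest occupied level."""
        for queue in reversed(self.queues):
            if queue:
                return queue[-1]
        return None


sched = MultiLevelScheduler(levels=4)
sched.enter(Program("edit", quanta_needed=1), level=0)
sched.enter(Program("compile", quanta_needed=5), level=0)
sched.enter(Program("wait_io", quanta_needed=3, dormant=True), level=1)
trace = [sched.step() for _ in range(4)]
victim = sched.eviction_candidate()
```

In this run the dormant program is passed from level 1 to level 2 without consuming any quanta, and the candidate for removal from high-speed memory is taken from the tail of the highest occupied queue.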
最后,将多级调度算法边界应用于当代IBM 7090具有启发性。获得以下近似值:
Finally, it is illuminating to apply the multi-level scheduling algorithm bounds to the contemporary IBM 7090. The following approximate values are obtained:
使用任意标准,编程最大大小为 32,000 个单词应该总是能得到一些服务,也就是说 max ℓ a = max ℓ 0,我们保守估计N可以为 4,并且在最坏的情况下响应一个简单的回复时间将是 32 秒。
Using the arbitrary criterion that programs up to the maximum size of 32,000 words should always get some service, which is to say that max ℓ_a = max ℓ_0, we deduce as a conservative estimate that N can be 4 and that at worst the response time for a trivial reply will be 32 seconds.
所达到的小N值是由慢盘字传输速率导致的小w q值的直接结果。该速率仅为最大核心内存复用器速率的 3.3%。有趣的是,使用当前设计的大容量高速鼓,例如在 SAGE 系统或 IBM Sabre 系统中,将有可能获得接近 100% 的多路复用器利用率,从而将w q乘以 30 倍。紧随其后的是,与上面给出的盘单元的用户响应时间相当的用户响应时间将为30倍的人数或120个用户提供;然而,总计算能力不会改变。
The small value of N arrived at is a direct consequence of the small value of w_q that results from the slow disc word transmission rate. This rate is only 3.3% of the maximum core memory multiplexor rate. It is of interest that using high-capacity high-speed drums of current design such as in the SAGE System or in the IBM Sabre System it would be possible to attain nearly 100% multiplexor utilization and thus multiply w_q by a factor of 30. It immediately follows that user response times equivalent to those given above with the disc unit would be given to 30 times as many persons or to 120 users; the total computational capacity, however, would not change.
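The factor of 30 and the figure of 120 users follow directly from the two percentages quoted. A short check, in which the variable names are illustrative and "nearly 100%" is taken as exactly 1.0 for the purpose of the estimate:

```python
disc_fraction = 0.033   # disc rate as a fraction of the core multiplexor rate (from the text)
drum_fraction = 1.0     # "nearly 100%" utilization, rounded up for the estimate
users_with_disc = 4     # the conservative N derived for the disc unit

# Ratio of the two transmission rates, i.e. the factor applied to w_q.
speedup = drum_fraction / disc_fraction

# Equivalent response times for proportionally more users.
users_with_drum = users_with_disc * round(speedup)
```

The estimate scales only the number of users served at a given response time; as the text notes, total computational capacity is unchanged.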
在任何情况下,对容量和计算机响应时间估计都应非常谨慎,因为它们严重依赖于用户响应时间tu的分布函数、用户程序大小以及每个用户请求的计算能力。过去使用传统编程系统的经验几乎没有什么帮助,因为这些分发功能将非常依赖于分时用户可用的编程系统以及将逐渐发展的用户习惯模式。
In any case, considerable caution should be used with capacity and computer response time estimates since they are critically dependent upon the distribution functions for the user response time, t_u, the user program size, and the computational capacity requested by each user. Past experience using conventional programming systems is of little assistance because these distribution functions will depend very strongly upon the programming systems made available to the time-sharing users as well as upon the user habit patterns which will gradually evolve.
总之,很明显,当代计算机和硬件足以为有限数量的用户提供适度的性能分时。有几个问题可以通过仔细的硬件设计来解决,但在拥有一个足够的分时系统之前还必须编写大量复杂的系统程序。任何未来的分时计算机的一个重要方面是,在系统编程完成之前,尤其是关键的分时管理器,计算机是完全没有价值的。因此,对于未来的系统设计和实现来说,在现有计算机上以原型形式探索和理解分时系统问题的各个方面至关重要,以便在计算机组织和使用方面取得重大进展。
In conclusion, it is clear that contemporary computers and hardware are sufficient to allow moderate performance time-sharing for a limited number of users. There are several problems which can be solved by careful hardware design, but there are also a large number of intricate system programs which must be written before one has an adequate time-sharing system. An important aspect of any future time-shared computer is that until the system programming is completed, especially the critical time-sharing supervisor, the computer is completely worthless. Thus, it is essential for future system design and implementation that all aspects of time-sharing system problems be explored and understood in prototype form on present computers so that major advances in computer organization and usage can be made.
转载自 Corbató 等人。(1962),经计算机协会许可。
Reprinted from Corbató et al. (1962), with permission from the Association for Computing Machinery.
伊万·萨瑟兰(Ivan Sutherland,生于 1938 年)在成长过程中,用他身为土木工程师的父亲丢弃的蓝图覆盖了他的课本。(他的母亲说,家里买不起同学们使用的大学主题书籍封面。)他在课堂无聊的时候盯着书籍封面,学会了简洁、优雅的蓝图语言。当他本科学习工程学时,他必须创建蓝图并解释它们,而他缺乏画好蓝图的灵巧和耐心。因此,当他作为研究生进入麻省理工学院并发现自己有时间使用 TX-2 计算机并且需要博士论文主题时,他想到对计算机进行编程来帮助完成绘图任务。萨瑟兰了解布什(第 11 章)和利克莱德(第 20 章)的愿景,但实际上并没有建成像他们想象的那样的东西。这篇论文是在克劳德·香农(Claude Shannon)的指导下撰写的,从而诞生了整个计算机图形和交互式计算领域。
While he was growing up, Ivan Sutherland (b. 1938) covered his school books with blueprints that his father, a civil engineer, had discarded. (His mother said that the family could not afford the college-themed book covers his schoolmates were using.) He learned the concise, elegant language of blueprints by staring at his book covers during moments of classroom boredom. When he studied engineering as an undergraduate, he had to create blueprints as well as to interpret them, and he lacked the dexterity and patience to draw them well. So when he entered MIT as a graduate student and found himself with time on the TX-2 computer and in need of a PhD thesis topic, it occurred to him to program the computer to help out with the drawing task. Sutherland was aware of the vision of both Bush (chapter 11) and Licklider (chapter 20), but nothing much like what they imagined had actually been built. This dissertation was written under the direction of Claude Shannon, and thus was born the entire field of computer graphics and interactive computing.
TX-2 是晶体管化的,当时是世界上最强大的计算机。它有 64K 的 36 位字——是其他机器内存的两倍。它还有一个原始的显示器——只有一个圆形阴极射线管,以及一条机器指令,该指令会在指令中指定的坐标处闪烁一个点。(此类显示器并不新鲜,它们曾是 EDVAC 设计的一部分。有关 1960 年 Licklider 对显示技术局限性的分析,请参阅第 210 页。)当时还没有光栅、线条图或字符显示器;只有一种显示技术。要绘制一条线,软件必须规定要显示的连续点的坐标,然后快速连续地闪烁每个点,重复该过程,给人一种同时显示整条线的错觉。更复杂的图像可以用同样的方式在屏幕上绘制,只要整个过程花费不到三十分之一秒左右,以避免烦人的闪烁。
The TX-2 was transistorized, and at the time was the most powerful computer in the world. It had 64K of 36-bit words—twice as much memory as any other machine. It also had a primitive display—just a circular cathode ray tube, and a machine instruction that would flash a point at coordinates specified in the instruction. (Such displays were not new—they had been part of the EDVAC design. See page 210 for Licklider’s analysis of the limitations of display technology in 1960.) There were no raster, line-drawing, or character displays at the time; to draw a line, the software had to stipulate the coordinates of the successive points to be displayed, and then flash each point in rapid succession, repeating the process to give the illusion that the whole line was being displayed at once. More complicated images could be painted on the screen in the same way, as long as the whole process took less than a thirtieth of a second or so, in order to avoid annoying flicker.
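To make the point-plotting scheme concrete, here is a minimal sketch of software that turns line endpoints into successive points and flashes them once per refresh pass. The simple interpolation used and the `flash` callback are assumptions for illustration; nothing here is taken from the TX-2's actual display code.

```python
def line_points(x0, y0, x1, y1):
    """Return the integer display coordinates to flash, in order, for one line."""
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [(x0, y0)]
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(steps + 1)
    ]


def refresh(display_list, flash):
    """One refresh pass: flash every point of every line exactly once."""
    for line in display_list:
        for x, y in line_points(*line):
            flash(x, y)


# Record the points instead of driving a real display.
flashed = []
refresh([(0, 0, 3, 0), (3, 0, 3, 2)], lambda x, y: flashed.append((x, y)))
```

Calling `refresh` repeatedly, fast enough that a full pass fits within roughly a thirtieth of a second, is what would give the illusion of a steady picture.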
为了创建绘图程序,萨瑟兰还需要一个图形输入设备。TX-2 有一个光笔——一个通过电线连接到计算机的光电管。如果将光电管放在所显示图像的一部分上,程序可以通过将光电管激活的时间与当时显示的点的位置相关联来确定其位置。通过在该位置显示一个小十字,并检查十字的哪些部分从一个瞬间到下一个瞬间对光电管可见,程序可以确定用户移动光笔的方向。
To create a drawing program, Sutherland also needed a graphic input device. The TX-2 had a light pen—a photocell connected to the computer by an electric wire. If the photocell was held over part of the displayed image, the program could determine where it was by correlating the time the photocell was activated to the position of the point being displayed at that moment. By displaying a small cross at that position and checking which parts of the cross were visible to the photocell from one instant to the next, the program could determine the direction in which the user was moving the light pen.
在接下来的十年里,所有这些工具都被更好的工具所取代——光纤电缆、平板电脑、触摸屏、光栅显示器、彩色显示器等等。萨瑟兰持久的智力贡献是人机共生的一种形式,正如利克莱德所预测的那样。画板允许用户规定约束,例如哪些线应该平行或应该在端点相交。当对象被拖动和扭曲时,程序会尽力满足所有约束。因此它理解了绘图的拓扑结构和几何形状。画板具有分组、复制、旋转和移动对象的功能。这是第一个计算机辅助设计程序。更值得注意的是,它对约束满足和对象层次结构等新颖编程技术的使用为非过程编程和面向对象系统设计播下了种子。
All these instrumentalities were replaced by better ones in the following decade: fiber optic cables, tablets, touchscreens, raster displays, color displays, and so on. Sutherland’s durable intellectual contribution was a form of human–computer symbiosis, exactly what Licklider had predicted. Sketchpad allowed the user to stipulate constraints, for example about which lines should be parallel or should meet at their endpoints. The program did its best to keep all constraints satisfied as objects were dragged and distorted. Thus it understood the topology as well as the geometry of the drawing. Sketchpad had facilities for grouping, reduplicating, rotating, and moving objects around. It was the first computer-aided design program. Even more remarkably, its use of novel programming techniques such as constraint satisfaction and object hierarchies sowed the seeds of both non-procedural programming and object-oriented system design.
1963 年左右,当萨瑟兰参观德克萨斯州的贝尔直升机公司时,他看到了一个设计用于帮助飞行员在夜间着陆的巧妙装置。安装在直升机底部的红外摄像机会根据飞行员的头部运动自动定向,摄像机图像通过棱镜投射到飞行员的视野中。Sutherland 提出了用计算机生成的图像代替红外摄像机图像的想法,因此诞生了第一个虚拟现实系统,该系统于 1968 年在哈佛大学建立。头戴式显示器通过伸缩管固定在天花板上,使佩戴者能够观看漂浮在太空中的线框立方体并在其周围行走。萨瑟兰继续在犹他州埃文斯和萨瑟兰计算机公司为第一个实用飞行模拟器构建图形系统。随着计算机图形学在接下来的十年中朝着极其逼真的方向发展,萨瑟兰将他的努力转向了非常快速的电子电路的设计,他仍然活跃在这个领域。
When Sutherland visited Bell Helicopter Company in Texas around 1963, he was shown a clever contraption designed to aid pilots landing at night. An infrared camera mounted on the bottom of the helicopter was automatically oriented in coordination with the pilot’s head movements, and the camera image was projected through prisms into the pilot’s field of view. Sutherland had the idea of substituting a computer-generated image for the infrared camera image, and thus was born the first virtual reality system, built at Harvard in 1968. A head-mounted display was tethered to the ceiling by telescoping tubes, enabling the wearer to view a wire-frame cube floating in space and to walk through and around it. Sutherland went on to build the graphic system for the first practical flight simulator at the Evans and Sutherland Computer Company in Utah. As computer graphics developed toward extraordinary verisimilitude in the next decade, Sutherland shifted his efforts to the design of very fast electronic circuits, a field in which he remains active.
另一项传记细节已被证明对计算机科学的发展很重要。1964 年,当 JCR Licklider 离开 ARPA 进入私营企业时,萨瑟兰接替了他,当时萨瑟兰刚刚从研究生院毕业,正在履行他作为后备军官训练队学员所承担的军事义务。在 Licklider 和 Sutherland 在 ARPA 任职期间,富有远见的资助计划刺激了整个美国的计算机科学研究,这也是创造未来的一部分。
One other biographical detail has proved important to the development of computer science. When J. C. R. Licklider left ARPA in 1964 to go to private industry, he was replaced by Sutherland, then fresh out of graduate school and fulfilling the military obligation he had incurred as a ROTC cadet. During the tenure of Licklider and Sutherland at ARPA, visionary funding initiatives stimulated computer science research all across the U.S. That too was part of the creation of the future.
Sketchpad系统使用绘图作为计算机的一种新颖的通信媒介。该系统包含输入、输出和计算程序,使其能够解释直接在计算机显示器上绘制的信息。它已被用来绘制电气、机械、科学、数学和动画绘图;它是一个通用系统。画板在帮助理解过程方面显示出最有用的功能,例如可以用图片描述的链接概念。画板也可以轻松绘制高度重复或高精度的绘图,并更改以前用它绘制的绘图。本论文中的许多图画都是用画板绘制的。
The Sketchpad system uses drawing as a novel communication medium for a computer. The system contains input, output, and computation programs which enable it to interpret information drawn directly on a computer display. It has been used to draw electrical, mechanical, scientific, mathematical, and animated drawings; it is a general purpose system. Sketchpad has shown the most usefulness as an aid to the understanding of processes, such as the notion of linkages, which can be described with pictures. Sketchpad also makes it easy to draw highly repetitive or highly accurate drawings and to change drawings previously drawn with it. The many drawings in this thesis were all made with Sketchpad.
画板用户使用“光笔”直接在计算机显示屏上绘制草图。光笔既用于在显示器上定位绘图的某些部分,也用于指向它们以进行更改。一组按钮控制要进行的更改,例如“擦除”或“移动”。除传说外,没有使用任何书面语言。绘制的信息可以包括直线段和圆弧。任意符号可以由线段、圆弧和先前定义的符号的任何集合来定义。用户可以根据自己的意愿定义和使用任意数量的符号。符号定义的任何更改都会立即显示在该符号出现的任何地方。
A Sketchpad user sketches directly on a computer display with a “light pen.” The light pen is used both to position parts of the drawing on the display and to point to them to change them. A set of push buttons controls the changes to be made such as “erase,” or “move.” Except for legends, no written language is used. Information sketched can include straight line segments and circle arcs. Arbitrary symbols may be defined from any collection of line segments, circle arcs, and previously defined symbols. A user may define and use as many symbols as he wishes. Any change in the definition of a symbol is at once seen wherever that symbol appears.
画板存储有关绘图拓扑的显式信息。如果用户移动多边形的一个顶点,则相邻的两侧都会移动。如果用户移动某个符号,则附加到该符号的所有线条将自动移动以保持附加状态。用户在绘制草图时会自动指示绘图的拓扑连接。由于画板能够以人类完全自然的图像语言接受来自人类的拓扑信息,因此它可以用作需要拓扑数据的计算程序(例如电路模拟器)的输入程序。
Sketchpad stores explicit information about the topology of a drawing. If the user moves one vertex of a polygon, both adjacent sides will be moved. If the user moves a symbol, all lines attached to that symbol will automatically move to stay attached to it. The topological connections of the drawing are automatically indicated by the user as he sketches. Since Sketchpad is able to accept topological information from a human being in a picture language perfectly natural to the human, it can be used as an input program for computation programs which require topological data, e.g., circuit simulators.
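One way to picture stored topology is to let lines refer to shared point objects rather than hold their own copies of coordinates. The sketch below uses hypothetical `Point` and `Line` classes and is not a description of Sketchpad's actual storage layout.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Line:
    start: Point   # a reference to a shared Point, not a copy of its coordinates
    end: Point

    def coords(self):
        return ((self.start.x, self.start.y), (self.end.x, self.end.y))


def polygon(points):
    """Connect consecutive points, closing the figure back to the first."""
    n = len(points)
    return [Line(points[i], points[(i + 1) % n]) for i in range(n)]


triangle_pts = [Point(0, 0), Point(4, 0), Point(0, 3)]
sides = polygon(triangle_pts)

# Move one vertex; both adjacent sides follow because they share the Point.
triangle_pts[1].x, triangle_pts[1].y = 5, 1
```

After the move, the two sides that meet at the moved vertex report the new coordinates, while the third side is unchanged.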
画板本身能够移动绘图的各个部分,以满足用户可能应用的新条件。用户用光笔和按钮指示条件。例如,为了使两条线平行,他连续用光笔指向两条线并按下按钮。条件本身显示在绘图上,以便可以用光笔语言擦除或更改它们。任何条件组合都可以定义为复合条件并在一个步骤中应用。
Sketchpad itself is able to move parts of the drawing around to meet new conditions which the user may apply to them. The user indicates conditions with the light pen and push buttons. For example, to make two lines parallel, he successively points to the lines with the light pen and presses a button. The conditions themselves are displayed on the drawing so that they may be erased or changed with the light pen language. Any combination of conditions can be defined as a composite condition and applied in one step.
向画板词汇表中添加全新类型的条件非常容易。由于条件可能涉及任何可计算的内容,因此画板可用于解决非常广泛的问题。例如,画板已被用来查找用其绘制的桁架桥构件中的力分布。
It is easy to add entirely new types of conditions to Sketchpad’s vocabulary. Since the conditions can involve anything computable, Sketchpad can be used for a very wide range of problems. For example, Sketchpad has been used to find the distribution of forces in the members of truss bridges drawn with it.
画板绘图以特殊设计的“环”结构存储在计算机中。环形结构的特点是无需搜索即可快速处理拓扑信息。描述了画板中用于操纵环结构的基本操作。
Sketchpad drawings are stored in the computer in a specially designed “ring” structure. The ring structure features rapid processing of topological information with no searching at all. The basic operations used in Sketchpad for manipulating the ring structure are described.
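The excerpt does not give the layout of the ring structure, so the following is only a generic sketch of the idea: a circular doubly linked ring in which insertion and removal need no searching. The class and method names are assumptions.

```python
class RingNode:
    """An element that always sits in a ring; alone, it links to itself."""

    def __init__(self, payload=None):
        self.payload = payload
        self.prev = self
        self.next = self

    def insert_after(self, node):
        """Splice `node` into this ring right after self, in constant time."""
        node.prev, node.next = self, self.next
        self.next.prev = node
        self.next = node

    def remove(self):
        """Unlink self from its ring in constant time, with no search."""
        self.prev.next = self.next
        self.next.prev = self.prev
        self.prev = self.next = self

    def members(self):
        """Walk the ring once, treating self as the head and skipping it."""
        node = self.next
        while node is not self:
            yield node.payload
            node = node.next


head = RingNode("lines attached to point P")
a, b, c = RingNode("a"), RingNode("b"), RingNode("c")
head.insert_after(a)
a.insert_after(b)
b.insert_after(c)
all_members = list(head.members())
b.remove()
after_removal = list(head.members())
```

Because each element carries its own forward and backward links, deleting a line from the ring of lines attached to a point touches only its two neighbours.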
画板系统使人和计算机能够通过线条图进行快速对话。迄今为止,由于需要将所有通信减少为可以键入的书面陈述,因此人与计算机之间的大多数交互都已减慢。过去,我们一直在给计算机写信,而不是与计算机交谈。对于许多类型的通信,例如描述机械零件的形状或电路的连接,键入的语句可能很麻烦。画板系统通过消除打字语句(图例除外)而代之以线条图,开辟了人机交流的新领域。
The Sketchpad system makes it possible for a man and a computer to converse rapidly through the medium of line drawings. Heretofore, most interaction between men and computers has been slowed down by the need to reduce all communication to written statements that can be typed; in the past, we have been writing letters to rather than conferring with our computers. For many types of communication, such as describing the shape of a mechanical part or the connections of an electrical circuit, typed statements can prove cumbersome. The Sketchpad system, by eliminating typed statements (except for legends) in favor of line drawings, opens up a new area of man–machine communication.
实际实施绘图系统的决定反映了我们的感觉,即只有通过实际尝试才能获得有用的设施的知识。然而,实际实施绘图系统的决定并不意味着要使用强力技术来将普通绘图工具计算机化。这项工作的研究性质隐含着,应该发现简单的新设施,这些设施一旦实施,应该可用于广泛的应用,最好包括一些不可预见的应用。事实证明,计算机绘图的属性与纸质绘图完全不同,不仅因为计算机提供的准确性、绘图方便性和擦除速度,而且主要是因为能够移动绘图部分在计算机上绘图,无需擦除它们。如果没有开发出一个可行的系统,我们的思维就会受到一生在纸上绘图的强烈影响,而无法发现计算机可以提供的许多有用的服务。
The decision actually to implement a drawing system reflected our feeling that knowledge of the facilities which would prove useful could only be obtained by actually trying them. The decision actually to implement a drawing system did not mean, however, that brute force techniques were to be used to computerize ordinary drafting tools; it was implicit in the research nature of the work that simple new facilities should be discovered which, when implemented, should be useful in a wide range of applications, preferably including some unforeseen ones. It has turned out that the properties of a computer drawing are entirely different from a paper drawing not only because of the accuracy, ease of drawing, and speed of erasing provided by the computer, but also primarily because of the ability to move drawing parts around on a computer drawing without the need to erase them. Had a working system not been developed, our thinking would have been too strongly influenced by a lifetime of drawing on paper to discover many of the useful services that the computer can provide.
随着工作的进展,已经发现并实施了一些简单且适用范围广泛的设施。它们提供了在绘图上包含任意符号的子图功能、以任何可计算方式关联绘图各部分的约束功能以及从简单原子约束的组合构建复杂关系的定义复制功能。当与演示性光笔语言给出的指向图像部分的能力相结合时,子图像、约束和定义复制功能产生了一个具有非凡力量的系统。正如一开始所希望的那样,该系统可用于广泛的应用,并且意想不到的用途正在出现。
As the work has progressed, several simple and very widely applicable facilities have been discovered and implemented. They provide a subpicture capability for including arbitrary symbols on a drawing, a constraint capability for relating the parts of a drawing in any computable way, and a definition copying capability for building complex relationships from combinations of simple atomic constraints. When combined with the ability to point at picture parts given by the demonstrative light pen language, the subpicture, constraint, and definition copying capabilities produce a system of extraordinary power. As was hoped at the outset, the system is useful in a wide range of applications, and unforeseen uses are turning up.
为了了解目前系统的可能性,让我们考虑使用它来绘制图 24.1的六边形图案。我们将使用一组按钮发出特定命令,使用开关打开和关闭功能,使用光笔指示位置信息并指向现有绘图部分,通过转动旋钮旋转和放大图片部分,并观察显示器上的绘图系统。林肯实验室的 TX-2 计算机(Clark 等人,1957)提供的该设备如图 24.2所示。当我们的绘图完成后,它可以像论文中的所有绘图一样,通过绘图仪(EAI,1959)在纸上涂上墨水,如图 24.3所示。我们通过这个例子的目的是展示计算机可以做什么来帮助我们绘图,同时将其如何执行其功能的详细信息留给后续章节。
To understand what is possible with the system at present let us consider using it to draw the hexagonal pattern of Figure 24.1. We will issue specific commands with a set of push buttons, turn functions on and off with switches, indicate position information and point to existing drawing parts with the light pen, rotate and magnify picture parts by turning knobs, and observe the drawing on the display system. This equipment as provided at Lincoln Laboratory’s TX-2 computer (Clark et al., 1957) is shown in Figure 24.2. When our drawing is complete it may be inked on paper, as were all the drawings in the thesis, by the plotter (EAI, 1959) shown in Figure 24.3. It is our intent with this example to show what the computer can do to help us draw while leaving the details of how it performs its functions for the chapters which follow.
图 24.1: 六角形图案。
Figure 24.1: Hexagonal pattern.
图 24.2: TX-2 操作区域——使用中的画板。显示屏上可以看到一座桥的一部分……。作者手持光笔。用于控制特定绘图功能的按钮位于作者前面的盒子上。在作者身后可以看到部分拨动开关。显示屏上看到的总图像部分的大小和位置是通过桌子上方的四个黑色旋钮获得的。
Figure 24.2: TX-2 operating area—Sketchpad in use. On the display can be seen part of a bridge …. The Author is holding the Light pen. The push buttons used to control specific drawing functions are on the box in front of the Author. Part of the bank of toggle switches can be seen behind the Author. The size and position of the part of the total picture seen on the display is obtained through the four black knobs just above the table.
图 24.3: 与画板一起使用的绘图仪。数字和模拟控制系统使绘图仪可以在 TX-2 的直接控制下或通过打孔纸带离线绘制直线和圆。
Figure 24.3: Plotter used with Sketchpad. A digital and analog control system makes the plotter draw straight lines and circles either under direct control of the TX-2 or off-line from punched paper tape.
图 24.4: 直线和圆的绘制。
Figure 24.4: Line and circle drawing.
图 24.5: 说明性示例。
Figure 24.5: Illustrative example.
为了使六边形成为正六边形,我们可以将它内接在一个圆上。要绘制圆圈,我们将光笔放置在中心位置,然后按“圆圈中心”按钮,留下一个中心点。现在,在圆上选择一个点(固定半径),再次按下“绘制”按钮,这次得到一条圆弧,其长度仅由光笔位置控制,如图 24.4所示。
To make the hexagon regular, we can inscribe it in a circle. To draw the circle we place the light pen where the center is to be and press the button “circle center,” leaving behind a center point. Now, choosing a point on the circle (which fixes the radius), we press the button “draw” again, this time getting a circle arc whose length only is controlled by light pen position as shown in Figure 24.4.
接下来,我们通过指向六边形的一个角并按下“移动”按钮将六边形移动到圆圈中,以便该角跟随光笔,拉伸其后面的两条橡皮筋线段。通过指向圆并进行终止轻弹,我们表明角位于圆上。每个角以这种方式以大致相等的间距移动到圆上,如图24.5D所示。
Next we move the hexagon into the circle by pointing to a corner of the hexagon and pressing the button “move” so that the corner follows the light pen, stretching two rubber band line segments behind it. By pointing to the circle and giving the termination flick we indicate that the corner is to lie on the circle. Each corner is in this way moved onto the circle at roughly equal spacing around it as shown in Figure 24.5D.
我们已经指出,六边形的顶点位于圆上,并且在我们进一步的操作中它们将保留在圆上。如果我们还坚持六边形的边长相等,则将构造出正六边形。我们可以通过指向一侧并按“复制”按钮,然后指向另一侧并进行终止轻拂来完成此操作。在这种情况下,该按钮复制等长线的定义并将其应用到指示的线。我们已经说过,实际上,使这条线的长度与那条线的长度相等。我们通过五个这样的语句表明所有六行的长度相等。每当我们打开切换开关时,计算机都会满足所有现有条件(如果可能)。完成后,我们就有了一个完整的内切于圆的正六边形。我们可以通过指向圆圈的任何部分并按“删除”按钮来擦除整个圆圈。完成的六边形如图 24.5F所示。
We have indicated that the vertices of the hexagon are to lie on the circle, and they will remain on the circle throughout our further manipulations. If we also insist that the sides of the hexagon be of equal length, a regular hexagon will be constructed. This we can do by pointing to one side and pressing the “copy” button, and then to another side and giving the termination flick. The button in this case copies a definition of equal length lines and applies it to the lines indicated. We have said, in effect, make this line equal in length to that line. We indicate that all six lines are equal in length by five such statements. The computer satisfies all existing conditions (if it is possible) whenever we turn on a toggle switch. This done, we have a complete regular hexagon inscribed in a circle. We can erase the entire circle by pointing to any part of it and pressing the “delete” button. The completed hexagon is shown in Figure 24.5F.
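The hexagon construction can be imitated with a tiny relaxation loop. This is not Sketchpad's solver, whose method is not described in this excerpt. The sketch assumes the corners stay on the circle (so each is represented only by its angle) and uses the fact that, for corners kept in order around the circle, equal sides correspond to equal angular gaps.

```python
import math


def regular_on_circle(angles, sweeps=200):
    """Relax vertex angles (radians, increasing) toward even spacing.

    Vertex 0 stays fixed. Every other vertex repeatedly slides along the
    circle to the angular midpoint of its two neighbours.
    """
    theta = list(angles)
    n = len(theta)
    for _ in range(sweeps):
        for i in range(1, n):
            before = theta[i - 1]
            # The last vertex's "next" neighbour is vertex 0, one full turn on.
            after = theta[i + 1] if i + 1 < n else theta[0] + 2 * math.pi
            theta[i] = (before + after) / 2
    return theta


def side_lengths(theta, radius):
    """Chord lengths between consecutive vertices on a circle of given radius."""
    n = len(theta)
    pts = [(radius * math.cos(t), radius * math.sin(t)) for t in theta]
    return [math.dist(pts[i], pts[(i + 1) % n]) for i in range(n)]


rough = [0.0, 0.9, 2.3, 3.0, 4.4, 5.1]   # six corners at roughly equal spacing
solved = regular_on_circle(rough)
sides = side_lengths(solved, radius=1.0)
```

Starting from roughly placed corners, the sides converge to a common length, which for a regular hexagon equals the radius of the circumscribed circle.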
为了制作图 24.1的六边形图案,我们希望通过角将大量六边形连接在一起,因此我们通过指向每个角并按下按钮来将六边形的六个角指定为连接点。现在,我们将基本的六边形归档,并通过更改开关设置开始在新的“纸”上工作。在新的纸张上,我们通过按下按钮将每个六边形创建为子图片来组装,围绕中心七分之一的六个六边形的大致位置如图24.5G所示。子图片可以用光笔整体定位,用旋钮旋转或缩放,并通过笔弹终止信号固定在适当的位置;但它们的内部形状是固定的。通过指向一个六边形的角,按下按钮,然后指向另一个六边形的角,我们可以将这些角固定在一起,因为这些角已被指定为附着点。如果我们将每个外六边形的两个角附加到内六边形的相应角上,则这七个角是唯一相关的,并且计算机将重新定位它们,如图24.5H所示。一整组六边形一旦组装起来,就可以被视为一个符号。整个组可以作为子图在另一张“纸”上调用,并与其他组或单个六边形组装在一起,形成一个非常大的图案。使用图 24.5H七次,我们得到图 24.1的模式。使用画板系统构建图 24.1的图案只需不到五分钟。
To make the hexagonal pattern of Figure 24.1 we wish to attach a large number of hexagons together by their corners, and so we designate the six corners of our hexagon as attachment points by pointing to each and pressing a button. We now file away the basic hexagon and begin work on a fresh “sheet of paper” by changing a switch setting. On the new sheet we assemble, by pressing a button to create each hexagon as a subpicture, six hexagons around a central seventh in approximate position as shown in Figure 24.5G. Subpictures may be positioned, each in its entirety, with the light pen, rotated or scaled with the knobs and fixed in position by the pen flick termination signal; but their internal shape is fixed. By pointing to the corner of one hexagon, pressing a button, and then pointing to the corner of another hexagon we can fasten those corners together, because these corners have been designated as attachment points. If we attach two corners of each outer hexagon to the appropriate corners of the inner hexagon, the seven are uniquely related, and the computer will reposition them as shown in Figure 24.5H. An entire group of hexagons, once assembled, can be treated as a symbol. The entire group can be called up on another “sheet of paper” as a subpicture and assembled with other groups or with single hexagons to make a very large pattern. Using Figure 24.5H seven times we get the pattern of Figure 24.1. Constructing the pattern of Figure 24.1 takes less than five minutes with the Sketchpad system.
24.2.2.1 子图 最初的六边形也可能是其他任何东西:晶体管的图片、滚柱轴承、飞机机翼、一封信或本报告的整个图形。如果需要,可以根据其他更简单的符号来绘制任何数量的不同符号,并且可以根据需要经常使用任何符号。
24.2.2.1 Subpicture. The original hexagon might just as well have been anything else: a picture of a transistor, a roller bearing, an airplane wing, a letter, or an entire figure for this report. Any number of different symbols may be drawn, in terms of other simpler symbols if desired, and any symbol may be used as often as desired.
24.2.2.2 约束 当我们要求六边形的顶点位于圆上时,我们利用了系统中内置的图片部分之间的基本关系。使线条垂直、水平、平行或垂直的基本关系(原子约束);使点位于直线或圆上;使符号直立、垂直排列或大小相等;并将符号与其他绘图部分(例如点和线)相关联已包含在系统中。编写新的约束类型非常容易,以至于原子约束集在大约两天的时间内从附录 A 中列出的 5 个扩展到 17 个;可以根据需要添加专门的约束类型。[编辑:附录已省略。]
24.2.2.2 Constraint. When we asked that the vertices of the hexagon lie on the circle we were making use of a basic relationship between picture parts that is built into the system. Basic relationships (atomic constraints) to make lines vertical, horizontal, parallel, or perpendicular; to make points lie on lines or circles; to make symbols appear upright, vertically above one another or be of equal size; and to relate symbols to other drawing parts such as points and lines have been included in the system. It is so easy to program new constraint types that the set of atomic constraints was expanded from five to the seventeen listed in Appendix A in a period of about two days; specialized constraint types may be added as needed. [EDITOR: Appendix omitted.]
24.2.2.3 定义复制 在上面的介绍性示例中,我们通过按下按钮并指向相关边来要求六边形的边长相等。这里我们使用了系统的定义复制功能。如果我们定义了一个复合操作,例如使两条线平行且长度相等,我们就可以同样轻松地应用它。可以从应用于各个图片部分的基本约束来定义的操作的数量几乎是无限的。定期制定有用的新定义;它们就像水平线一样简单,也像带有箭头和正确指示线长度的数字的尺寸线一样复杂。定义复制功能使得约束功能的使用变得容易。
24.2.2.3 Definition copying. In the introductory example above we asked that the sides of the hexagon be equal in length by pressing a button while pointing to the side in question. Here we were using the definition copying capability of the system. Had we defined a composite operation such as to make two lines both parallel and equal in length, we could have applied it just as easily. The number of operations which can be defined from the basic constraints applied to various picture parts is almost unlimited. Useful new definitions are drawn regularly; they are as simple as horizontal lines and as complicated as dimension lines complete with arrowheads and a number which indicates the length of the line correctly. The definition copying capability makes using the constraint capability easy.
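A rough way to see why new constraint types and composite definitions are cheap is to model each atomic constraint as a function that returns an error, zero when the condition holds. The representation, the registry `ATOMIC`, and the helper `define` are all assumptions made for this sketch and are not taken from the thesis.

```python
import math

# A line is ((x0, y0), (x1, y1)). Each atomic constraint returns an error
# that is zero when the condition holds (assumed representation).


def _vec(line):
    (x0, y0), (x1, y1) = line
    return (x1 - x0, y1 - y0)


def parallel(a, b):
    (ax, ay), (bx, by) = _vec(a), _vec(b)
    return abs(ax * by - ay * bx)   # cross product vanishes for parallel lines


def equal_length(a, b):
    return abs(math.hypot(*_vec(a)) - math.hypot(*_vec(b)))


def horizontal(a):
    return abs(_vec(a)[1])


ATOMIC = {"parallel": parallel, "equal_length": equal_length, "horizontal": horizontal}


def define(*names):
    """Build a composite condition whose error is the sum of its atomic errors."""
    parts = [ATOMIC[n] for n in names]
    return lambda *lines: sum(p(*lines) for p in parts)


# A composite definition, applied in one step like any atomic constraint.
parallel_and_equal = define("parallel", "equal_length")

# Adding a new atomic type is a single registration.
ATOMIC["vertical"] = lambda a: abs(_vec(a)[0])

ok = parallel_and_equal(((0, 0), (2, 0)), ((1, 1), (3, 1)))
bad = parallel_and_equal(((0, 0), (2, 0)), ((1, 1), (1, 4)))
```

A solver would then move picture parts to drive the summed error toward zero; that step is left out here.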
正是这种存储绘图各部分相互关联的信息的能力,使得画板变得非常有用。例如,图24.6所示的连杆是用画板在短短几分钟内绘制出来的。对连杆施加约束以保持其各个成员的长度恒定。短中央连杆的旋转应该垂直移动虚线的左端。由于有关链接属性的准确信息已存储在画板,当短中心连杆旋转时,可以观察整个连杆的运动。图 24.6中的数字值被限制为指示虚线的长度,将实际运动与连杆右侧的垂直线进行比较。人们可以观察到,对于连杆的所有位置,虚线的长度都是恒定的,这表明这确实是直线连杆。使用画板制作的移动绘图的其他示例可以在最后一章中找到。
It is this ability to store information relating the parts of a drawing to each other that makes Sketchpad most useful. For example, the linkage shown in Figure 24.6 was drawn with Sketchpad in just a few minutes. Constraints were applied to the linkage to keep the length of its various members constant. Rotation of the short central link is supposed to move the left end of the dotted line vertically. Since exact information about the properties of the linkage has been stored in Sketchpad, it is possible to observe the motion of the entire linkage when the short central link is rotated. The value of the number in Figure 24.6 was constrained to indicate the length of the dotted line, comparing the actual motion with the vertical line at the right of the linkage. One can observe that for all positions of the linkage the length of the dotted line is constant, demonstrating that this is indeed a straight line linkage. Other examples of moving drawings made with Sketchpad may be found in the final chapter.
图 24.6: 连杆的四个位置。数字表示虚线的长度。
Figure 24.6: Four positions of linkage. Number shows length of dotted line.
除了存储绘图的各个部分如何相关之外,画板还存储所使用的子图的结构。例如,图 24.1的六边形图案的存储表明该图案由较小的图案组成,而较小的图案又由较小的图案组成,而较小的图案又由单个六边形组成。如果改变主六边形,则六边形图案的整个外观将会改变。当然,图案的结构是相同的。例如,如果我们将基本六边形改为半圆形,则立即产生如图 24.7所示的鱼鳞图案。
As well as storing how the various parts of the drawing are related, Sketchpad stores the structure of the subpicture used. For example, the storage for the hexagonal pattern of Figure 24.1 indicates that this pattern is made of smaller patterns which are in turn made of smaller patterns which are composed of single hexagons. If the master hexagon is changed, the entire appearance of the hexagonal pattern will be changed. The structure of the pattern will, of course, be the same. For example, if we change the basic hexagon into a semicircle, the fish scale pattern shown in Figure 24.7 instantly results.
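The effect of changing a master can be sketched with pictures that hold references to other pictures instead of copies. The `Picture` class below is hypothetical, and it omits the position, rotation, and scale that each real subpicture instance would carry.

```python
class Picture:
    """A drawing made of its own parts plus instances of other pictures."""

    def __init__(self, name, parts=()):
        self.name = name
        self.parts = list(parts)   # primitive parts, e.g. "line" or "arc"
        self.instances = []        # references to master pictures, never copies

    def add_instance(self, master, count=1):
        self.instances.extend([master] * count)

    def expand(self):
        """Flatten the hierarchy into the primitive parts that would be shown."""
        shown = list(self.parts)
        for master in self.instances:
            shown.extend(master.expand())
        return shown


hexagon = Picture("hexagon", parts=["line"] * 6)
cluster = Picture("cluster")            # seven hexagons, as in Figure 24.5H
cluster.add_instance(hexagon, count=7)
pattern = Picture("pattern")            # seven clusters, as in Figure 24.1
pattern.add_instance(cluster, count=7)

before = pattern.expand()
hexagon.parts = ["arc"]                 # redefine the master as a semicircle
after = pattern.expand()
```

Only the master definition was edited, yet every one of the 49 instances in the pattern changes, while the structure of the pattern stays the same.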
图 24.7: 同一格子上的半六边形和半圆形。
Figure 24.7: Half hexagons and semicircles on same lattice.
由于画板存储绘图的结构,因此画板绘图明确指示符号的相似性。例如,在电气绘图中,所有晶体管符号都是根据单个主晶体管绘图创建的。如果对基本晶体管符号进行一些更改,则此更改会立即出现在所有晶体管符号中,而无需进一步努力。最重要的是,计算机“知道”电路中的那个位置应该有一个“晶体管”。它不需要解释我们很容易将其识别为晶体管符号的线集合。由于画板存储了绘图的拓扑结构,就像我们在闭合六边形时看到的那样,因此当使用画板绘制电路时,可以指示电路的外观及其电气连接。人们可以看到电路连接被存储,因为移动组件会自动移动该组件上的任何接线以保持正确的连接。画板电路图很快将用作电路模拟器的输入。画出电路后,我们就会发现它的电气特性。
Since Sketchpad stores the structure of a drawing, a Sketchpad drawing explicitly indicates similarity of symbols. In an electrical drawing, for example, all transistor symbols are created from a single master transistor drawing. If some change to the basic transistor symbol is made, this change appears at once in all transistor symbols without further effort. Most important of all, the computer “knows” that a “transistor” is intended at that place in the circuit. It has no need to interpret the collection of lines which we would easily recognize as a transistor symbol. Since Sketchpad stores the topology of the drawing as we saw in closing the hexagon, one indicates both what a circuit looks like and its electrical connections when one draws it with Sketchpad. One can see that the circuit connections are stored because moving a component automatically moves any wiring on that component to maintain the correct connections. Sketchpad circuit drawings will soon be used as inputs for a circuit simulator. Having drawn a circuit one will find out its electrical properties.
24.2.5.1 对现有图形进行小的更改 每次绘制图形时,该图形的描述都会以易于传输到磁带的形式存储在计算机中。因此,随着时间的推移,图纸库将会发展起来,其中的一部分可以用在其他图纸中,而只需投入原始图纸时间的一小部分。由于存储在计算机中的绘图可能在其约束中包含设计条件的明确表示,因此对关键部分的手动更改将自动导致对相关部分的适当更改。
24.2.5.1 For making small changes to existing drawings. Each time a drawing is made, a description of that drawing is stored in the computer in a form that is readily transferred to magnetic tape. Thus, as time passes, a library of drawings will develop, parts of which may be used in other drawings at only a fraction of the investment of time that was put into the original drawing. Since a drawing stored in the computer may contain explicit representation of design conditions in its constraints, manual change of a critical part will automatically result in appropriate changes to related parts.
24.2.5.2 为了获得对可以图形化描述的操作的科学或工程理解 画板系统中存储的绘图的描述不仅仅是静态绘图部分、直线和曲线等的集合。画板系统中的绘图可能包含关于其各部分之间的关系的明确陈述,以便当一个部分发生变化时,这一变化的含义在整个附图中变得显而易见。正如我们在图 24.6中看到的,可以赋予线固定长度的属性,以便研究机械连接,观察某些部件移动时某些部件的路径。正如我们在图 24.7中看到的那样,子图片定义中所做的任何更改都会立即反映在该子图片的外观中,无论它发生在哪里。通过进行此类更改,可以获得对复杂子图片集的关系的理解。例如,人们可以研究晶体结构基本元素的变化如何反映在整个晶体中。
24.2.5.2 For gaining scientific or engineering understanding of operations that can be described graphically
The description of a drawing stored in the Sketchpad system is more than a collection of static drawing parts, lines and curves, etc. A drawing in the Sketchpad system may contain explicit statements about the relations between its parts so that as one part is changed the implications of this change become evident throughout the drawing. It is possible, as we saw in Figure 24.6, to give the property of fixed length to lines so as to study mechanical linkages, observing the path of some parts when others are moved. As we saw in Figure 24.7, any change made in the definition of a subpicture is at once reflected in the appearance of that subpicture wherever it may occur. By making such changes, understanding of the relationships of complex sets of subpictures can be gained. For example, one can study how a change in the basic element of a crystal structure is reflected throughout the crystal.
24.2.5.3 As a topological input device for circuit simulators, etc.
Since the ring structure storage of Sketchpad reflects the topology of any circuit or diagram, it can serve as an input for many network or circuit simulating programs. The additional effort required to draw a circuit completely from scratch with the Sketchpad system may well be recompensed if the properties of the circuit are obtainable through simulation of the circuit drawn.
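To illustrate how a stored topology could feed a circuit simulator, here is a minimal Python sketch. It does not model Sketchpad's ring structure; it assumes only that each component terminal is attached to a shared point of the drawing. The component names, pin names, point identifiers, and the `netlist` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical drawing: (component, pin, shared point) triples.
# Two terminals attached to the same point are electrically connected.
terminals = [
    ("R1", "a", "p1"), ("R1", "b", "p2"),
    ("Q1", "base", "p2"), ("Q1", "emitter", "p3"), ("Q1", "collector", "p4"),
    ("R2", "a", "p4"), ("R2", "b", "p1"),
]

def netlist(terminals):
    """Group component pins by the drawing point they share (one net per point)."""
    nets = defaultdict(list)
    for component, pin, point in terminals:
        nets[point].append(f"{component}.{pin}")
    return dict(nets)

nets = netlist(terminals)
for point, pins in nets.items():
    print(point, pins)
```

Because connectivity is read from shared points rather than inferred from line positions, moving a component in such a representation would not change the resulting nets.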
24.2.5.4 For highly repetitive drawings
The ability of the computer to reproduce any drawn symbol anywhere at the press of a button, and to recursively include subpictures within subpictures makes it easy to produce drawings which are composed of huge numbers of parts all similar in shape. Great interest in doing this comes from people in such fields as memory development and micro logic where vast numbers of elements are to be generated at once through photographic processes. Master drawings of the repetitive patterns necessary can be easily drawn. Here again, the ability to change the individual element of the repetitive structure and have the change at once brought into all subelements makes it possible to change the elements of an array without redrawing the entire array. …
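The master-and-instance idea described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Sketchpad's actual data structure: instances are only translated (no rotation or scaling), and the names `Master`, `Instance`, and `expand` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Master:
    """A picture definition: its own lines plus instances of other masters."""
    name: str
    lines: list = field(default_factory=list)      # [((x1, y1), (x2, y2)), ...]
    instances: list = field(default_factory=list)  # nested Instance objects

@dataclass
class Instance:
    """A placed reference to a master; it stores no geometry of its own."""
    master: Master
    dx: float = 0.0
    dy: float = 0.0

def expand(master, dx=0.0, dy=0.0):
    """Return every line of `master` shifted by (dx, dy), expanding nested instances recursively."""
    out = [((x1 + dx, y1 + dy), (x2 + dx, y2 + dy))
           for (x1, y1), (x2, y2) in master.lines]
    for inst in master.instances:
        out.extend(expand(inst.master, dx + inst.dx, dy + inst.dy))
    return out

# A 3-by-4 array built from one cell: array -> rows -> cells.
cell = Master("cell", lines=[((0, 0), (1, 0))])
row = Master("row", instances=[Instance(cell, dx=2 * i) for i in range(4)])
array = Master("array", instances=[Instance(row, dy=2 * j) for j in range(3)])

lines_before = len(expand(array))   # 12 cells, one line each
cell.lines.append(((0, 0), (0, 1))) # edit the master once
lines_after = len(expand(array))    # every one of the 12 cells now has two lines
print(lines_before, lines_after)
```

Since each instance holds only a reference to its master, the single edit to `cell` shows up in all twelve placements without redrawing the array.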
… Had I the work to do again, I could start afresh with the sure knowledge that generic structure, separation of subroutines into general purpose ones applying to all types of picture parts and ones specific to particular types of picture parts, and unlimited applicability of functions (e.g. anything should be moveable) would more than recompense the effort involved in achieving them. I have great admiration for those people who were able to tell me these things all along, but I, personally, had to follow the stumbling trail described in this chapter to become convinced myself. It is to be hoped that future workers can either grasp the power of generality at once and strive for it or have the courage to stumble along a trail like mine until they achieve it. …
[T]here are possibilities for application of the system not yet even dreamed of. The richness of the possibilities of the definition copying function, and the new types of constraints which might easily be added to the system for special purposes suggest that further application will bring about a new body of knowledge of system application. For example, the bridge design examples shown at the end of this paper were not anticipated. There are, of course, limitations to the system. In the last chapter are suggested the improvements, some just minor changes, but some major additions which would change the entire character of the system. It is to be hoped that future work will far surpass my effort.
Reprinted from Sutherland (1963), with permission from the Massachusetts Institute of Technology.
The transistor was one of the greatest inventions of the twentieth century. Small, cool, reliable, and drawing very little power, by the mid-1950s transistors were replacing vacuum tubes in many applications. The first battery-operated pocket radios appeared around 1954; they were a miraculous improvement over plug-in tabletop radios with glowing tubes. It became apparent that transistors could be used not just as amplifiers but as switches, stimulating the design of electronic computers like the TX-2.
And yet no one saw the real revolution coming: miniaturization. In the early 1960s it became possible to mass-produce entire electronic circuits by photolithography. At first the silicon chips accommodated only a handful of transistors, but the numbers increased rapidly. In 1965, Gordon Moore (b. 1929) was asked by Electronics—a trade magazine, not a scholarly journal—to predict what would happen to semiconductor engineering over the next decade. This article was his response. Moore was at the time the director of research and development at Fairchild Semiconductor, and based on his observation of the improvements in the design and manufacture of chips, he predicted that the number of components on a chip would double yearly. (He later ramped down that prediction to doubling every two years.)
The prediction soon turned into a challenge; it certainly is not, and never has been, the “law” that it was dubbed. While it is astonishing that the prediction held true for thirty or forty generations, it was obvious from the beginning that eventually the sizes of the transistors and their interconnections would have to become comparable to the size of the molecules of which they were composed. As that point approached, the “law” was generalized to refer to increases in speed or computing capacity achievable in some way other than packing more components onto a chip.
Be that as it may, Moore’s original article is remarkably prescient in calling attention to the yield of working components in the manufacturing process, to the eventual emergence of a consumer version of the electronic computer, and in other ways.
In 1968, Moore and Robert Noyce co-founded Intel Corporation, still one of the largest semiconductor companies in the world.
The future of integrated electronics is the future of electronics itself. The advantages of integration will bring about a proliferation of electronics, pushing this science into many new areas.
Integrated circuits will lead to such wonders as home computers—or at least terminals connected to a central computer—automatic controls for automobiles, and personal portable communications equipment. The electronic wristwatch needs only a display to be feasible today.
But the biggest potential lies in the production of large systems. In telephone communications, integrated circuits in digital filters will separate channels on multiplex equipment. Integrated circuits will also switch telephone circuits and perform data processing.
Computers will be more powerful, and will be organized in completely different ways. For example, memories built of integrated electronics may be distributed throughout the machine instead of being concentrated in a central unit. In addition, the improved reliability made possible by integrated circuits will allow the construction of larger processing units. Machines similar to those in existence today will be built at lower costs and with faster turnaround.
By integrated electronics, I mean all the various technologies which are referred to as microelectronics today as well as any additional ones that result in electronics functions supplied to the user as irreducible units. These technologies were first investigated in the late 1950s. The object was to miniaturize electronics equipment to include increasingly complex electronic functions in limited space with minimum weight. Several approaches evolved, including microassembly techniques for individual components, thin-film structures, and semiconductor integrated circuits.
Each approach evolved rapidly and converged so that each borrowed techniques from another. Many researchers believe the way of the future to be a combination of the various approaches.
The advocates of semiconductor integrated circuitry are already using the improved characteristics of thin-film resistors by applying such films directly to an active semiconductor substrate. Those advocating a technology based upon films are developing sophisticated techniques for the attachment of active semiconductor devices to the passive film arrays.
Both approaches have worked well and are being used in equipment today.
Integrated electronics is established today. Its techniques are almost mandatory for new military systems, since the reliability, size, and weight required by some of them is achievable only with integration. Such programs as Apollo, for manned moon flight, have demonstrated the reliability of integrated electronics by showing that complete circuit functions are as free from failure as the best individual transistors.
Most companies in the commercial computer field have machines in design or in early production employing integrated electronics. These machines cost less and perform better than those which use “conventional” electronics.
Instruments of various sorts, especially the rapidly increasing numbers employing digital techniques, are starting to use integration because it cuts costs of both manufacture and design.
The use of linear integrated circuitry is still restricted primarily to the military. Such integrated functions are expensive and not available in the variety required to satisfy a major fraction of linear electronics. But the first applications are beginning to appear in commercial electronics, particularly in equipment which needs low-frequency amplifiers of small size.
In almost every case, integrated electronics has demonstrated high reliability. Even at the present level of production—low compared to that of discrete components—it offers reduced systems cost, and in many systems improved performance has been realized.
Integrated electronics will make electronic techniques more generally available throughout all of society, performing many functions that presently are done inadequately by other techniques or not done at all. The principal advantages will be lower costs and greatly simplified design—payoffs from a ready supply of low-cost functional packages.
For most applications, semiconductor integrated circuits will predominate. Semiconductor devices are the only reasonable candidates presently in existence for the active elements of integrated circuits. Passive semiconductor elements look attractive too, because of their potential for low cost and high reliability, but they can be used only if precision is not a prime requisite.
Silicon is likely to remain the basic material, although others will be of use in specific applications. For example, gallium arsenide will be important in integrated microwave functions. But silicon will predominate at lower frequencies because of the technology which has already evolved around it and its oxide, and because it is an abundant and relatively inexpensive starting material.
Reduced cost is one of the big attractions of integrated electronics, and the cost advantage continues to increase as the technology evolves toward the production of larger and larger circuit functions on a single semiconductor substrate. For simple circuits, the cost per component is nearly inversely proportional to the number of components, the result of the equivalent piece of semiconductor in the equivalent package containing more components. But as components are added, decreased yields more than compensate for the increased complexity, tending to raise the cost per component. Thus there is a minimum cost at any given time in the evolution of the technology. At present, it is reached when 50 components are used per circuit. But the minimum is rising rapidly while the entire cost curve is falling (see graph). If we look ahead five years, a plot of costs suggests that the minimum cost per component might be expected in circuits with about 1000 components per circuit (providing such circuit functions can be produced in moderate quantities). In 1970, the manufacturing cost per component can be expected to be only a tenth of the present cost.
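The trade-off in this paragraph can be made concrete with a toy model. The Python sketch below is not Moore's cost data: it assumes a fixed cost per packaged circuit shared among n components, and a yield that falls off as exp(-n / n0). The function name `cost_per_component`, the exponential yield form, and the value n0 = 50 (chosen so that the minimum lands at the 50 components the text quotes) are all assumptions.

```python
import math

def cost_per_component(n, package_cost=1.0, n0=50.0):
    """Relative cost per working component for a circuit of n components.

    Assumed model: one fixed cost per packaged circuit is shared by n
    components, and the fraction of good circuits is exp(-n / n0).
    """
    yield_fraction = math.exp(-n / n0)
    return package_cost / (n * yield_fraction)

# For small n the cost falls roughly as 1/n; for large n the yield loss dominates.
best_n = min(range(1, 1001), key=cost_per_component)
print(best_n)  # 50
```

In this model the minimum sits at n = n0, so any improvement in yield (a larger n0) moves the cheapest circuit size to the right, which matches the text's remark that the minimum is rising.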
The complexity for minimum component costs has increased at a rate of roughly a factor of two per year (see graph). Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least ten years. That means by 1975, the number of components per integrated circuit for minimum cost will be 65,000.
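The 65,000 figure follows from ten annual doublings. The Python sketch below reproduces the arithmetic under one assumption: a 1965 starting point of 64 components (2 to the 6th power), a round value near the "50 components" quoted earlier and chosen here so that the result matches. The function name and the `doubling_period` parameter are hypothetical.

```python
def components_at_min_cost(year, base_year=1965, base_components=64,
                           doubling_period=1.0):
    """Extrapolate components per circuit, doubling once every `doubling_period` years."""
    doublings = (year - base_year) / doubling_period
    return int(base_components * 2 ** doublings)

yearly = components_at_min_cost(1975)                        # 64 * 2**10
biennial = components_at_min_cost(1975, doubling_period=2.0) # 64 * 2**5
print(yearly, biennial)  # 65536 2048
```

The second call shows how much a slower rate changes the result: doubling every two years, the later revision mentioned in the editor's introduction, gives 2,048 rather than about 65,000 over the same decade.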
I believe that such a large circuit can be built on a single wafer.
With the dimensional tolerances already being employed in integrated circuits, isolated high-performance transistors can be built on centers two-thousandths of an inch apart. Such a two-mil square can also contain several kilohms of resistance or a few diodes. This allows at least 500 components per linear inch or a quarter million per square inch. Thus, 65,000 components need occupy only about one-fourth a square inch.
On the silicon wafer currently used, usually an inch or more in diameter, there is ample room for such a structure if the components can be closely packed with no space wasted for interconnection patterns. This is realistic, since efforts to achieve a level of complexity above the presently available integrated circuits are already under way using multilayer metallization patterns separated by dielectric films. Such a density of components can be achieved by present optical techniques and does not require the more exotic techniques, such as electron beam operations, which are being studied to make even smaller structures.
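The area estimate in the two preceding paragraphs can be checked directly. This short Python sketch uses only the figures given in the text (two-mil centers, 65,000 components, a one-inch wafer); the variable names are illustrative.

```python
import math

MIL = 0.001                                 # one thousandth of an inch
pitch_in = 2 * MIL                          # transistors on two-mil centers

per_linear_inch = round(1 / pitch_in)       # 500 components per inch
per_square_inch = per_linear_inch ** 2      # 250,000 components per square inch
area_needed = 65_000 / per_square_inch      # square inches for 65,000 components

wafer_area = math.pi * (1.0 / 2) ** 2       # one-inch-diameter wafer

print(per_linear_inch, per_square_inch)     # 500 250000
print(round(area_needed, 2))                # 0.26
print(area_needed < wafer_area)             # True
```

A one-inch wafer has roughly 0.79 square inches of area, so 65,000 components at this density would take about a third of it, before allowing for interconnections.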
There is no fundamental obstacle to achieving device yields of 100%. At present, packaging costs so far exceed the cost of the semiconductor structure itself that there is no incentive to improve yields, but they can be raised as high as is economically justified. No barrier exists comparable to the thermodynamic equilibrium considerations that often limit yields in chemical reactions; it is not even necessary to do any fundamental research or to replace present processes. Only the engineering effort is needed.
In the early days of integrated circuitry, when yields were extremely low, there was such incentive. Today ordinary integrated circuits are made with yields comparable with those obtained for individual semiconductor devices. The same pattern will make larger arrays economical, if other considerations make such arrays desirable.
Will it be possible to remove the heat generated by tens of thousands of components in a single silicon chip?
If we could shrink the volume of a standard high-speed digital computer to that required for the components themselves, we would expect it to glow brightly with present power dissipation. But it won’t happen with integrated circuits. Since integrated electronic structures are two dimensional, they have a surface available for cooling close to each center of heat generation. In addition, power is needed primarily to drive the various lines and capacitances associated with the system. As long as a function is confined to a small area on a wafer, the amount of capacitance which must be driven is distinctly limited. [EDITOR: Heat became a problem because this premise ceased to be true.] In fact, shrinking dimensions on an integrated structure makes it possible to operate the structure at higher speed for the same power per unit area.
Clearly, we will be able to build such component-crammed equipment. Next, we ask under what circumstances we should do it. The total cost of making a particular system function must be minimized. To do so, we could amortize the engineering over several identical items, or evolve flexible techniques for the engineering of large functions so that no disproportionate expense need be borne by a particular array. Perhaps newly devised design automation procedures could translate from logic diagram to technological realization without any special engineering.
It may prove to be more economical to build large systems out of smaller functions, which are separately packaged and interconnected. The availability of large functions, combined with functional design and construction, should allow the manufacturer of large systems to design and construct a considerable variety of equipment both rapidly and economically.
Integration will not change linear systems as radically as digital systems. Still, a considerable degree of integration will be achieved with linear circuits. The lack of large-value capacitors and inductors is the greatest fundamental limitation to integrated electronics in the linear area.
By their very nature, such elements require the storage of energy in a volume. For high Q it is necessary that the volume be large. The incompatibility of large volume and integrated electronics is obvious from the terms themselves. Certain resonance phenomena, such as those in piezoelectric crystals, can be expected to have some applications for tuning functions, but inductors and capacitors will be with us for some time.
The integrated RF amplifier of the future might well consist of integrated stages of gain, giving high performance at minimum cost, interspersed with relatively large tuning elements.
Other linear functions will be changed considerably. The matching and tracking of similar components in integrated structures will allow the design of differential amplifiers of greatly improved performance. The use of thermal feedback effects to stabilize integrated structures to a small fraction of a degree will allow the construction of oscillators with crystal stability.
Even in the microwave area, structures included in the definition of integrated electronics will become increasingly important. The ability to make and assemble components small compared with the wavelengths involved will allow the use of lumped parameter design, at least at the lower frequencies. It is difficult to predict at the present time just how extensive the invasion of the microwave area by integrated electronics will be. The successful realization of such items as phased-array antennas, for example, using a multiplicity of integrated microwave power sources, could completely revolutionize radar.
Reprinted from Moore (1965, 2006), with permission from the Institute of Electrical and Electronics Engineers.
Edsger Dijkstra (1930–2002) was a brilliant and challenging figure. Even completing the sentence “Dijkstra was a Dutch …” is a challenge. He was a devastatingly trenchant thinker and elegant writer about computer programming. But he didn’t like the term “computer science,” preferring “computing science” if such a term had to be used at all. He rejected the bifurcation of computing into theory and engineering; he insisted on elegance and clarity in everything he touched. “The art of programming is the art of organizing complexity, of mastering multitude and avoiding its bastard chaos as effectively as possible,” he wrote (Dijkstra, 1972). And if that required more brainpower, well, not everyone was cut out for the work. “Don’t blame me for the fact that competent programming, as I view it as an intellectual possibility, will be too difficult for ‘the average programmer’—you must not fall into the trap of rejecting a surgical technique because it is beyond the capabilities of the barber in his shop around the corner” (Dijkstra, 1975).
Dijkstra’s career was spent trying to move the focus from the machine that executed a program to the abstract thinking behind the computation. Trained as a mathematician and a physicist, he challenged the field to think methodically about programming problems, and to reason about them mathematically, before writing a line of code. He made important contributions to many aspects of the field, originally as a practicing programmer in the computer industry, and later as a professor at the Technological University in Eindhoven in the Netherlands and at the University of Texas at Austin. His name is associated with an elegant shortest-path algorithm (Dijkstra, 1959), and he also made important contributions to the study of compilers, operating systems (see chapter 28), distributed systems, and programming methods (see chapter 29).
This selection is arguably the first scientific paper on algorithmic problems—as opposed to language constructs—in the field of concurrent programming (which others might call parallel computing or multiprogramming). The paper is a model of logical rigor, though the code is far from self-documenting. It set off the entire research field of concurrent computing. Operating systems and database systems in particular have the property that because concurrent code segments are executed so many times under so many different conditions, anything that can possibly go wrong eventually will. Dijkstra here points the way to the use of clear reasoning to preclude timing bugs. He introduces the terms “critical section” and “mutual exclusion,” and proves that his code meets what would now be called safety and liveness conditions (Lamport, 2015).
As much as Dijkstra set a standard for rigorous thinking, he was dismissive of work that did not meet his standards—including entire research programs. One can’t fully appreciate his imperative for logical clarity without understanding his attitude toward work that lacked it. Asked in 1985 about the future of artificial intelligence, he replied, “Can you research something that is not science? I feel that the effort to use machines to try to mimic human reasoning is both foolish and dangerous. It is foolish because if you look at human reasoning as is, it is pretty lousy; even the most trained mathematicians are amateur thinkers. … Any successful AI project by its very nature would castrate the machine.” Asked about the huge increase in student interest in computer science, he replied, “If you ask whether there are too many or too few students: an order of magnitude too many. From a scientific point of view you would like to weed out the lot. Keep the brightest 2% and do business” (van Vlissingen and Dijkstra, 1985).
On the importance of making software more usable: “[T]he computer user, as functioning in the development of computer products is not a real person of flesh and blood but a literary figure …. He is stupid, education resistant if not education proof, and he hates any form of intellectual demand made on him, he cannot be delighted by something beautiful, because he lacks the education to appreciate beauty. Large sections of computer science are paralyzed by accepting this moron as their typical customer.”
Given in this paper is a solution to a problem which, to the knowledge of the author, has been an open question since at least 1962, irrespective of the solvability. The paper consists of three parts: the problem, the solution, and the proof. Although the setting of the problem might seem somewhat academic at first, the author trusts that anyone familiar with the logical problems that arise in computer coupling will appreciate the significance of the fact that this problem indeed can be solved.
To begin, consider N computers, each engaged in a process which, for our aims, can be regarded as cyclic. In each of the cycles a so-called “critical section” occurs and the computers have to be programmed in such a way that at any moment only one of these N cyclic processes is in its critical section. In order to effectuate this mutual exclusion of critical-section execution the computers can communicate with each other via a common store. Writing a word into or nondestructively reading a word from this store are undividable operations; i.e., when two or more computers try to communicate (either for reading or for writing) simultaneously with the same common location, these communications will take place one after the other, but in an unknown order.
The solution must satisfy the following requirements.
(a) The solution must be symmetrical between the N computers; as a result we are not allowed to introduce a static priority.
(b) Nothing may be assumed about the relative speeds of the N computers; we may not even assume their speeds to be constant in time.
(c) If any of the computers is stopped well outside its critical section, this is not allowed to lead to potential blocking of the others.
(d) If more than one computer is about to enter its critical section, it must be impossible to devise for them such finite speeds, that the decision to determine which one of them will enter its critical section first is postponed until eternity. In other words, constructions in which “After you”-“After you”-blocking is still possible, although improbable, are not to be regarded as valid solutions.
We beg the challenged reader to stop here for a while and have a try himself, for this seems the only way to get a feeling for the tricky consequences of the fact that each computer can only request one one-way message at a time. And only this will make the reader realize to what extent this problem is far from trivial.
公共存储由以下部分组成:“布尔数组 b, c [1: N ]; 整数 k ”。
The common store consists of: “boolean array b, c[1: N]; integer k”.
整数k满足1≤k≤N , b [ i ]和c [ i ]只能由第i台计算机设置;他们将接受其他人的检查。假设所有计算机都在其关键部分之外启动,并且所有提到的布尔数组都设置为true;k的起始值并不重要。
The integer k will satisfy 1 ≤ k ≤ N; b[i] and c[i] will only be set by the ith computer, and they will be inspected by the others. It is assumed that all computers are started well outside their critical sections with all boolean arrays mentioned set to true; the starting value of k is immaterial.
第 i台计算机 (1 ≤ i ≤ N )的程序为:
The program for the ith computer (1 ≤ i ≤ N) is:
我们首先观察到该解决方案是安全的,即没有两台计算机可以同时处于其临界区。因为进入临界区的唯一方法是执行复合语句 Li4 而不跳回 Li1,即在将自己的 c 设置为 false 后发现所有其他 c 均为 true。
We start by observing that the solution is safe in the sense that no two computers can be in their critical section simultaneously. For the only way to enter its critical section is the performance of the compound statement Li4 without jumping back to Li1, i.e., finding all other c’s true after having set its own c to false.
证明的第二部分必须表明不会发生无限的“在你之后”-“在你之后”阻塞;即,当没有计算机处于其临界区时,正在循环(即跳回 Li1)的计算机中至少有一台(因此恰好一台)将在适当的时间被允许进入其临界区。
The second part of the proof must show that no infinite “After you”-“After you”-blocking can occur; i.e., when none of the computers is in its critical section, of the computers looping (i.e., jumping back to Li1) at least one—and therefore exactly one—will be allowed to enter its critical section in due time.
如果第 k 台计算机不在循环的计算机之列,则 b[k] 为 true,并且循环的计算机都会发现 k ≠ i。结果,它们中的一台或多台将在 Li3 中发现布尔值 b[k] 为 true,因此一台或多台将决定执行赋值“k := i”。在第一次赋值“k := i”之后,b[k] 变为 false,此后不会再有新的计算机决定为 k 赋新值。当所有已决定的对 k 的赋值完成后,k 将指向其中一台循环的计算机,并且暂时不会改变其值,即直到 b[k] 变为 true,也就是直到第 k 台计算机完成其临界区为止。一旦 k 的值不再改变,第 k 台计算机就会等待(通过复合语句 Li4),直到所有其他 c 都为 true;这种情况即使尚未出现也必定会出现,因为所有其他循环的计算机都被迫将其 c 设置为 true,因为它们会发现 k ≠ i。作者认为,这就完成了证明。
If the kth computer is not among the looping ones, b[k] will be true and the looping ones will all find k ≠ i. As a result one or more of them will find in Li3 the boolean b[k] true and therefore one or more will decide to assign “k:= i”. After the first assignment “k:= i”, b[k] becomes false and no new computers can decide again to assign a new value to k. When all decided assignments to k have been performed, k will point to one of the looping computers and will not change its value for the time being, i.e., until b[k] becomes true, viz., until the kth computer has completed its critical section. As soon as the value of k does not change any more, the kth computer will wait (via the compound statement Li4) until all other c’s are true, but this situation will certainly arise, if not already present, because all other looping ones are forced to set their c true, as they will find k ≠ i. And this, the author believes, completes the proof.
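The proof refers to statement labels Li1, Li3, and Li4 of the program for the ith computer. As an aid to following it, here is a minimal Python sketch of the protocol as it is usually presented from Dijkstra (1965), with the labels marked in comments. It is an illustration under stated assumptions, not the original ALGOL text: the function name `run_dijkstra_mutex`, the counters used to observe mutual exclusion, and the `time.sleep(0)` yield are all additions made for the demonstration, and the sketch assumes CPython, where single reads and writes of list elements behave like the indivisible store operations the paper requires.

```python
import threading
import time


def run_dijkstra_mutex(n=3, rounds=20):
    """Run n threads through the protocol; return (max_inside, entries)."""
    # Common store. Index 0 is unused so that indices match the paper's 1..N.
    b = [True] * (n + 1)
    c = [True] * (n + 1)
    k = [1]       # boxed integer k; its starting value is immaterial
    inside = [0]  # demo only: threads currently inside the critical section
    stats = {"max_inside": 0, "entries": 0}

    def computer(i):
        for _ in range(rounds):
            b[i] = False                      # Li0: announce interest
            while True:                       # Li1
                if k[0] != i:
                    c[i] = True               # Li2
                    if b[k[0]]:               # Li3: computer k is not competing
                        k[0] = i
                else:
                    c[i] = False              # Li4
                    if all(c[j] for j in range(1, n + 1) if j != i):
                        break                 # all other c's true: enter
                time.sleep(0)                 # demo only: yield, then go to Li1

            # Critical section (demo bookkeeping, safe only if exclusion holds).
            inside[0] += 1
            stats["max_inside"] = max(stats["max_inside"], inside[0])
            stats["entries"] += 1
            inside[0] -= 1

            c[i] = True
            b[i] = True
            # Remainder of the cycle, in which stopping is allowed.

    threads = [threading.Thread(target=computer, args=(i,)) for i in range(1, n + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["max_inside"], stats["entries"]
```

Calling `run_dijkstra_mutex(n=3, rounds=20)` runs three threads through twenty cycles each and reports the largest number of threads ever observed inside the critical section together with the total number of entries. A run like this can only fail to falsify the safety argument; it does not replace the proof.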
经计算机协会许可,转载自 Dijkstra (1965)。
Reprinted from Dijkstra (1965), with permission from the Association for Computing Machinery.
约瑟夫·魏森鲍姆(Joseph Weizenbaum,1923-2008)是一名德国犹太难民,13 岁时随家人来到美国。在韦恩州立大学学习数学和计算机后,他加入了麻省理工学院计算机科学系任教。从 1964 年开始,他在那里写出了我们现在所说的聊天机器人的第一个例子——这些程序“知道”得很少,但通过操纵对话伙伴的话语来创造一种对话的错觉。令他惊讶的是,人们对他的简单程序投入得如此热切,就好像它是人类一样。一个著名的例子是,他自己的助理比任何人都清楚回应她倾诉的并不是人,却在使用 ELIZA 时要求魏森鲍姆离开房间(Weizenbaum,1976,第 6 页)。就好像她认为他在偷听一场私人谈话一样。
Joseph Weizenbaum (1923–2008) was a German Jewish refugee who came to the United States with his family at the age of 13. After studying mathematics and computing at Wayne State University, he joined the MIT faculty in computer science. There, starting in 1964, he wrote the first example of what we would now call chatbots—programs that “know” very little but create an illusion of conversation by manipulating the discourse of their conversational partners. He was surprised that people engaged intensely with his simple program, as though it was human. Famously, his own staff assistant, who knew better than anyone that no human being was answering her musings, asked Weizenbaum to leave the room while she was using ELIZA (Weizenbaum, 1976, p. 6). It was as though she thought he was eavesdropping on a personal conversation.
E LIZA引起了轰动,部分原因是分时技术在 1966 年还是个新事物,以至于 Weizenbaum 觉得他必须在这篇论文中向《ACM 通讯》的读者解释它。第一次,没有受过技术培训的人开始使用计算机,程序员开始编写旨在让普通人获得乐趣的程序。
ELIZA was a sensation, in part because time-sharing was new in 1966—so new that in this paper Weizenbaum felt he had to explain it to the readers of the Communications of the ACM. For the first time, people with no technical training were starting to use computers, and programmers started to write programs designed, in no small measure, to allow ordinary people to have some fun.
但 ELIZA 也触动了人类深处的神经。该程序以萧伯纳戏剧《皮格马利翁》中的角色伊丽莎·杜利特尔 (Eliza Doolittle) 命名,该剧于 1956 年被改编为百老汇音乐剧《窈窕淑女》。根据这部音乐剧改编的电影于 1964 年上映。在萧伯纳的戏剧中,伊丽莎是一位未受过教育的伦敦卖花姑娘,她被语言学家亨利·希金斯教授“重新编程”,以假装出身贵族。希金斯爱上了(几乎)完美改造的伊丽莎,就像在最初的希腊神话中,艺术家皮格马利翁爱上了他雕刻的象牙女人雕像一样。
But ELIZA also touched a deep human nerve. The program is named after Eliza Doolittle, a character in George Bernard Shaw’s play Pygmalion, which became the Broadway musical My Fair Lady in 1956. A film based on the musical was released in 1964. In Shaw’s play, Eliza is an unschooled London flower girl who is “reprogrammed” by Professor Henry Higgins, a linguist, to feign aristocratic roots. Higgins falls in love with the (almost) perfectly transformed Eliza, in the same way that in the original Greek myth, the artist Pygmalion falls in love with the ivory statue he has sculpted of a woman.
事实上,希腊神话中的皮格马利翁比现代戏剧更接近《伊莉莎》的现实,因为它涉及无生命的动画。在原著中,皮格马利翁的祈祷得到了回应,众神为他的雕像注入了生命。这只是西方关于无生命物体以人的形式复活的神话之一(见第十九页)。
The Greek myth of Pygmalion is in fact closer than the modern drama to the reality of ELIZA, because it involves the animation of the inanimate. In the original, Pygmalion’s prayers are answered and the gods breathe life into his statue. This is only one of the Western myths of an inanimate object being brought to life in human form (see page xix).
魏森鲍姆小时候目睹了人类的非人性化,他对人们如此容易被愚弄深感不安,并对同事们使机器人性化的科学议程持怀疑态度。他一生都是人工智能的尖锐批评者。他最重要的作品是他试图将人类和机器一劳永逸地分开,其标题是“计算机能力与人类理性”(Weizenbaum,1976)。它并没有说服人工智能的倡导者,随着理解和合成语音以及模拟情感技术的成熟,关于计算机应该(而不是可以)取代人类交互的争论仍在继续。
Having witnessed as a boy the dehumanization of human beings, Weizenbaum was deeply troubled that people were so easily fooled, and skeptical of his colleagues’ scientific agenda to humanize machines. He was a sharp critic of artificial intelligence throughout his life; his most important work, his attempt to separate humans and machines once and for all, was entitled Computer Power and Human Reason (Weizenbaum, 1976). It did not convince AI advocates, and with the maturing of technologies of understanding and synthesizing speech and simulating emotion, the debate has continued about where computers should—as opposed to can—replace human interactions.
E LIZA是一个在麻省理工学院的 MAC 分时系统中运行的程序,它使人与计算机之间的某些自然语言对话成为可能。根据输入文本中出现的关键词触发的分解规则对输入句子进行分析。响应由与所选分解规则相关联的重组规则生成。E LIZA关注的基本技术问题是:(1)关键词的识别,(2)最小上下文的发现,(3)适当转换的选择,(4)在没有关键词的情况下生成响应(5) 为 E LIZA “脚本”提供编辑功能。本文最后讨论了与 E LIZA方法相关的一些心理问题以及未来的发展。
ELIZA is a program operating within the MAC time-sharing system at MIT which makes certain kinds of natural language conversation between man and computer possible. Input sentences are analyzed on the basis of decomposition rules which are triggered by key words appearing in the input text. Responses are generated by reassembly rules associated with selected decomposition rules. The fundamental technical problems with which ELIZA is concerned are: (1) the identification of key words, (2) the discovery of minimal context, (3) the choice of appropriate transformations, (4) generation of responses in the absence of key words, and (5) the provision of an editing capability for ELIZA “scripts.” A discussion of some psychological issues relevant to the ELIZA approach as well as of future developments concludes the paper.
有人说,解释就是解释掉。这句格言在计算机编程领域得到了最好的体现,尤其是在所谓的启发式编程和人工智能领域。因为在这些领域中,机器的行为方式非常奇妙,即使是最有经验的观察者也常常会感到眼花缭乱。但是,一旦一个特定的程序被揭开面纱,一旦它的内部工作原理被用足够简单的语言解释以促进理解,它的魔力就会崩溃。它只是一系列程序的集合,每个程序都很容易理解。观察者对自己说:“我本来可以这样写的。” 带着这样的想法,他把有问题的程序从标有“智能”的架子上移到了为古玩保留的架子上,只适合与比他不那么开明的人讨论。
It is said that to explain is to explain away. This maxim is nowhere so well fulfilled as in the area of computer programming, especially in what is called heuristic programming and artificial intelligence. For in those realms machines are made to behave in wondrous ways, often sufficient to dazzle even the most experienced observer. But once a particular program is unmasked, once its inner workings are explained in language sufficiently plain to induce understanding, its magic crumbles away; it stands revealed as a mere collection of procedures, each quite comprehensible. The observer says to himself “I could have written that.” With that thought he moves the program in question from the shelf marked “intelligent,” to that reserved for curios, fit to be discussed only with people less enlightened than he.
本文的目的就是引起对即将“解释”的程序的重新评估。很少有程序比这更需要它了。
The object of this paper is to cause just such a re-evaluation of the program about to be “explained.” Few programs ever needed it more.
E LIZA是一个可以与计算机进行自然语言对话的程序。它目前的实现是在麻省理工学院的 MAC 分时系统上。它是用 MAD-SLIP(Weizenbaum,1963)为 IBM 7094 编写的。选择它的名称是为了强调它可以由用户逐步改进,因为它的语言能力可以由“老师”不断改进。就像《皮格马利翁》中的伊丽莎一样,它可以显得更加文明,但表象与现实的关系仍然属于剧作家的范畴。
ELIZA is a program which makes natural language conversation with a computer possible. Its present implementation is on the MAC time-sharing system at MIT. It is written in MAD-SLIP (Weizenbaum, 1963) for the IBM 7094. Its name was chosen to emphasize that it may be incrementally improved by its users, since its language abilities may be continually improved by a “teacher.” Like the Eliza of Pygmalion fame, it can be made to appear even more civilized, the relation of appearance to reality, however, remaining in the domain of the playwright.
就目前的目的而言,将MAC系统描述为允许个人通过远程打字机操作全尺寸计算机的系统就足够了。个体操作员有一种错觉,认为他是计算机综合体的唯一用户,而实际上其他人可能是与他“分时”系统。这里重要的是计算机可以读取打字机上输入的消息并通过在同一台仪器上书写来做出响应。计算机收到消息和出现响应之间的时间是控制对话的程序和诸如当前与系统对应的用户数量等MAC系统参数的函数。后面这些参数通常对整体响应时间的贡献很小,以至于与计算机的会话交互永远不会涉及真正无法忍受的延迟。
For the present purpose it is sufficient to characterize the MAC system as one which permits an individual to operate a full scale computer from a remotely located typewriter. The individual operator has the illusion that he is the sole user of the computer complex, while in fact others may be “time-sharing” the system with him. What is important here is that the computer can read messages typed on the typewriter and respond by writing on the same instrument. The time between the computer’s receipt of a message and the appearance of its response is a function of the program controlling the dialogue and of such MAC system parameters as the number of users currently corresponding with the system. These latter parameters generally contribute so little to the overall response time that conversational interaction with the computer need never involve truly intolerable delays.
当与 E LIZA对话时,用户使用正常的标点符号和句子结构以自然语言键入一些语句或一组语句。只有问号不能使用,因为它被 MAC 系统解释为行删除字符。用户的语句以双回车结束,用于将控制权移交给 E LIZA。然后, E LIZA分析用户的陈述并生成一些响应并打印出来。然后控制权再次由用户掌控。典型的对话如下:
When in conversation with ELIZA, the user types in some statement or set of statements in natural language using normal punctuation and sentence structures. Only the question mark may not be used, because it is interpreted as a line delete character by the MAC system. The user’s statement is terminated by a double carriage return which serves to turn control over to ELIZA. ELIZA then analyzes the user’s statement and generates some response which it types out. Control is then again with the user. A typical conversation is the following:
男人都是一样的。
Men are all alike.
以什么方式
IN WHAT WAY
他们总是因为某事或其他事来骚扰我们。
They’re always bugging us about something or other.
你能想到一个具体的例子吗
CAN YOU THINK OF A SPECIFIC EXAMPLE
嗯,是我男朋友让我来这里的。
Well, my boyfriend made me come here.
你的男朋友让你来这里的
YOUR BOYFRIEND MADE YOU COME HERE
他说我很多时候都很沮丧。
He says I’m depressed much of the time.
我很遗憾听到你很沮丧
I AM SORRY TO HEAR YOU ARE DEPRESSED
这是真的。我不开心。
It’s true. I am unhappy.
你认为来这里会帮助你不再不快乐吗
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
我需要一些帮助,这似乎是肯定的。
I need some help, that much seems certain.
如果您得到一些帮助,这对您意味着什么
WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP
也许我可以学会和妈妈相处。
Perhaps I could learn to get along with my mother.
告诉我更多关于您家庭的信息
TELL ME MORE ABOUT YOUR FAMILY
我妈妈照顾我。
My mother takes care of me.
您家里还有谁照顾您
WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU
我的父亲。
My father.
你的父亲
YOUR FATHER
在某些方面你很像我的父亲。
You are like my father in some ways.
您看到什么相似之处
WHAT RESEMBLANCE DO YOU SEE
你不是很有攻击性,但我想你不想让我注意到这一点。
You are not very aggressive but I think you don’t want me to notice that.
是什么让你认为我不太有攻击性
WHAT MAKES YOU THINK I AM NOT VERY AGGRESSIVE
你别跟我争论。
You don’t argue with me.
你为什么认为我不与你争论
WHY DO YOU THINK I DON’T ARGUE WITH YOU
你怕我。
You are afraid of me.
你愿意相信我害怕你吗
DOES IT PLEASE YOU TO BELIEVE I AM AFRAID OF YOU
我父亲害怕所有人。
My father is afraid of everybody.
当你想到你的父亲时,你还会想到什么
WHAT ELSE COMES TO MIND WHEN YOU THINK OF YOUR FATHER
恶霸。
Bullies.
这和你男朋友让你来这里有什么关系吗
DOES THAT HAVE ANYTHING TO DO WITH THE FACT THAT YOUR BOYFRIEND MADE YOU COME HERE
大写的行是机器响应。
The capitalized lines are the machine responses.
该程序的总体流程非常简单:读取文本并检查是否存在关键字。如果找到这样的单词,则根据与关键字相关联的规则来转换句子;如果没有找到,则给出一句无实质内容的话,或者在某些条件下检索较早的转换。然后打印出如此计算或检索的文本。
The gross procedure of the program is quite simple: the text is read and inspected for the presence of a keyword. If such a word is found, the sentence is transformed according to a rule associated with the keyword; if not, a content-free remark or, under certain conditions, an earlier transformation is retrieved. The text so computed or retrieved is then printed out.
当然,详细而言,上面概述的过程要复杂得多。例如,关键字可能具有排名(RANK)或优先级编号。该过程对这些数字很敏感,因为它将放弃在文本的从左到右扫描中已经找到的关键字,转而使用具有更高排名的关键字。此外,该过程将逗号或句点识别为分隔符。每当遇到任一分隔符并且已经找到关键字时,所有后续文本都会从输入消息中删除。如果尚未找到关键字,则删除分隔符左侧的短语或句子(以及分隔符本身)。结果,只有单个短语或句子被转换。
In detail, of course, the procedure sketched above is considerably more complex. Keywords, for example, may have a RANK or precedence number. The procedure is sensitive to such numbers in that it will abandon a keyword already found in the left-to-right scan of the text in favor of one having a higher rank. Also, the procedure recognizes a comma or a period as a delimiter. Whenever either one is encountered and a keyword has already been found, all subsequent text is deleted from the input message. If no key had yet been found the phrase or sentence to the left of the delimiter (as well as the delimiter itself) is deleted. As a result, only single phrases or sentences are ever transformed.
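The scanning procedure just described can be sketched in a few lines of Python. This is a minimal illustration assuming a plain dictionary of keyword ranks. The function name `select_keyword`, the tokenizing regular expression, and the rank values in the usage note are hypothetical and are not taken from the MAD-SLIP implementation.

```python
import re


def select_keyword(message, ranks):
    """Return (keyword, phrase): the highest-ranked keyword and the one
    phrase that would be handed on for transformation."""
    # Split into lowercase words and the two delimiters, comma and period.
    tokens = re.findall(r"[A-Za-z']+|[.,]", message.lower())
    phrase = []
    best = None
    for tok in tokens:
        if tok in (".", ","):
            if best is not None:
                break        # keyword already found: drop all subsequent text
            phrase = []      # no keyword yet: drop the phrase left of the delimiter
            continue
        phrase.append(tok)
        # Abandon the current keyword only for one of strictly higher rank.
        if tok in ranks and (best is None or ranks[tok] > ranks[best]):
            best = tok
    return best, phrase
```

With the made-up ranks `{"my": 2, "here": 1}`, the input "Well, my boyfriend made me come here." discards "Well," because no keyword has been seen when the comma is reached, keeps "my" over the lower-ranked "here", and stops at the period. When no keyword occurs at all, the function returns `None` for the keyword, which is the case in which a content-free remark or an earlier transformation would be used.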
关键字及其相关的转换规则构成了特定类别对话的脚本。E LIZA的一个重要特性是脚本就是数据;即,它不是程序本身的一部分。因此,E LIZA不限于一组特定的识别模式或响应,甚至不限于任何特定的语言。E LIZA脚本(在撰写本文时)有威尔士语、德语以及英语版本。
Keywords and their associated transformation rules constitute the SCRIPT for a particular class of conversation. An important property of ELIZA is that a script is data; i.e., it is not part of the program itself. Hence, ELIZA is not restricted to a particular set of recognition patterns or responses, indeed not even to any specific language. ELIZA scripts exist (at this writing) in Welsh and German as well as in English.
E LIZA必须关注的基本技术问题如下:
The fundamental technical problems with which ELIZA must be preoccupied are the following:
1. 识别输入消息中“最重要”的关键字。
1. The identification of the “most important” keyword in the input message.
2. 识别所选关键词出现的一些最小上下文;例如,如果关键字是“you”,那么它后面是否跟着单词“are”(在这种情况下可能会做出断言)。
2. The identification of some minimal context within which the chosen keyword appears; e.g., if the keyword is “you,” is it followed by the word “are” (in which case an assertion is probably being made).
3. 选择适当的转换规则,当然还有转换本身的制定。
3. The choice of an appropriate transformation rule and, of course, the making of the transformation itself.
4. 提供一种机制,允许 ELIZA 在输入文本不包含关键字时进行“智能”响应。
4. The provision of a mechanism that will permit ELIZA to respond “intelligently” when the input text contains no keywords.
5. 提供在脚本编写级别上促进脚本编辑、特别是扩展的机制。
5. The provision of machinery that facilitates editing, particularly extension, of the script on the script writing level.
当然,存在一些通常的限制,即需要经济地使用计算机时间和存储空间。
There are, of course, the usual constraints dictated by the need to be economical in the use of computer time and storage space.
核心问题显然是文本操作,而该问题的核心是转换规则的概念,据说该概念与某些关键字相关。“转换规则”口号下包含的机制是许多 SLIP 函数,这些函数用于 (1) 根据某些标准分解数据字符串,从而测试该字符串是否满足这些标准,以及 (2)根据一定的组装规范重新组装分解的弦。……
The central issue is clearly one of text manipulation, and at the heart of that issue is the concept of the transformation rule which has been said to be associated with certain keywords. The mechanisms subsumed under the slogan “transformation rule” are a number of SLIP functions which serve to (1) decompose a data string according to certain criteria, hence to test the string as to whether it satisfies these criteria or not, and (2) to reassemble a decomposed string according to certain assembly specifications. …
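The two halves of a transformation rule, decomposition as a test and reassembly from a specification, can be illustrated with a small Python sketch. It uses regular expressions in place of the SLIP functions, whose actual rule format is not given in this excerpt. The names `decompose`, `reassemble`, and `SWAPS`, the pattern syntax, and the sample rule are all assumptions made for illustration.

```python
import re

# Hypothetical pronoun substitutions applied to the captured fragments.
SWAPS = {"me": "you", "my": "your", "i": "you", "you": "me", "your": "my"}


def decompose(pattern, text):
    """Test the text against a decomposition pattern.
    Return the captured parts, or None if the criteria are not satisfied."""
    m = re.fullmatch(pattern, text.strip().rstrip(".").lower())
    if m is None:
        return None
    return [g.strip() for g in m.groups()]


def reassemble(template, parts):
    """Build a response from the parts according to an assembly template."""
    swapped = [" ".join(SWAPS.get(w, w) for w in p.split()) for p in parts]
    return template.format(*swapped).upper()


# One illustrative rule: a decomposition pattern and its reassembly template.
PATTERN = r"(.*)you are (.*)"
TEMPLATE = "DOES IT PLEASE YOU TO BELIEVE I AM {1}"
```

Under these assumptions, `decompose(PATTERN, "You are afraid of me.")` yields the parts `["", "afraid of me"]`, and `reassemble(TEMPLATE, ...)` turns them into "DOES IT PLEASE YOU TO BELIEVE I AM AFRAID OF YOU", matching one exchange in the sample conversation. An input such as "My father." fails the decomposition test and returns `None`, so a different rule would have to be tried.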
在撰写本文时,唯一存在的严肃的 E LIZA脚本是一些导致 E LIZA做出粗略反应的脚本,就像某些心理治疗师(Rogerians)那样。当人类通讯员最初被指示通过打字机与它“交谈”时,E LIZA表现最佳,就像人们与精神病医生交谈一样。选择这种对话模式是因为精神病学访谈是分类二元自然语言交流的少数例子之一,其中参与的一对可以自由地摆出对现实世界几乎一无所知的姿势。例如,如果一个人告诉一位精神科医生“我去坐了很长一段船”,而他回答“告诉我关于船的事”,人们不会认为他对船一无所知,而是认为他这样指导是有某种目的的。随后的谈话。值得注意的是,这一假设是演讲者做出的。它是否现实是一个完全不同的问题。
At this writing, the only serious ELIZA scripts which exist are some which cause ELIZA to respond roughly as would certain psychotherapists (Rogerians). ELIZA performs best when its human correspondent is initially instructed to “talk” to it, via the typewriter of course, just as one would to a psychiatrist. This mode of conversation was chosen because the psychiatric interview is one of the few examples of categorized dyadic natural language communication in which one of the participating pair is free to assume the pose of knowing almost nothing of the real world. If, for example, one were to tell a psychiatrist “I went for a long boat ride” and he responded “Tell me about boats,” one would not assume that he knew nothing about boats, but that he had some purpose in so directing the subsequent conversation. It is important to note that this assumption is one made by the speaker. Whether it is realistic or not is an altogether separate question.
无论如何,它具有至关重要的心理效用,因为它可以帮助说话者保持被听到和理解的感觉。说话者通过将各种背景知识、见解和推理能力归因于他的对话伙伴,进一步捍卫他的印象(甚至可能是虚幻的)。但同样,这些都是演讲者对对话的贡献。它们在他对所提供的回答做出的解释中以推论的方式表现出来。从纯粹的技术编程角度来看,E LIZA脚本的精神病学访谈形式的优点是它消除了存储有关现实世界的明确信息的需要。
In any case, it has a crucial psychological utility in that it serves the speaker to maintain his sense of being heard and understood. The speaker further defends his impression (which may even be illusory) by attributing to his conversational partner all sorts of background knowledge, insights, and reasoning ability. But again, these are the speaker’s contribution to the conversation. They manifest themselves inferentially in interpretations he makes of the offered responses. From the purely technical programming point of view then, the psychiatric interview form of an ELIZA script has the advantage that it eliminates the need of storing explicit information about the real world.
正如已经说过的,人类说话者将为 E LIZA的反应披上合理的外衣做出很大贡献。但他不会克服一切困难来捍卫他的幻想(他被理解)。在人类对话中,说话者会对他的对话伙伴做出某些(也许是慷慨的)假设。只要仍然可以根据这些假设来解释后者的反应,说话者对其伴侣的形象就保持不变,特别是未受损。难以如此解释的回答很可能会提高伴侣的形象,并进行额外的合理化,从而使对其回答的更复杂的解释变得合理。
The human speaker will, as has been said, contribute much to clothe ELIZA’s responses in vestments of plausibility. But he will not defend his illusion (that he is being understood) against all odds. In human conversation a speaker will make certain (perhaps generous) assumptions about his conversational partner. As long as it remains possible to interpret the latter’s responses consistently with those assumptions, the speaker’s image of his partner remains unchanged, in particular, undamaged. Responses which are difficult to so interpret may well result in an enhancement of the image of the partner, in additional rationalizations which then make more complicated interpretations of his responses reasonable.
然而,当这种合理化变得过于庞大甚至自相矛盾时,整个形象可能会崩溃并被另一个形象所取代(“他毕竟没有我想象的那么聪明”)。当对话伙伴是一台机器时(机器和程序之间的区别在这里没有用),那么可信度的概念很可能会取代上面的合理性的概念。
When, however, such rationalizations become too massive and even self-contradictory, the entire image may crumble and be replaced by another (“He is not, after all, as smart as I thought he was”). When the conversational partner is a machine (the distinction between machine and program is here not useful) then the idea of credibility may well be substituted for that of plausibility in the above.
以 E LIZA作为基本工具,可以进行实验,让受试者相信打字机上出现的反应是由坐在另一个房间的类似仪器前的人产生的。剧本必须怎样写才能长期保持这个想法的可信度?如何系统地降低 E LIZA的性能,以实现受试者的可信度的受控和可预测阈值?在所有这一切中,对主体的初始指导的作用是什么?另一方面,假设受试者被告知他正在与机器通信。根据他与机器的对话经历,他对这台机器有什么看法?有些受试者很难相信 E LIZA(及其现有脚本)不是人类。这是图灵测试的一种引人注目的形式。什么样的实验设计可以使它更加严格和无懈可击?
With ELIZA as the basic vehicle, experiments may be set up in which the subject finds it credible to believe that the responses which appear on his typewriter are generated by a human sitting at a similar instrument in another room. How must the script be written in order to maintain the credibility of this idea over a long period of time? How can the performance of ELIZA be systematically degraded in order to achieve controlled and predictable thresholds of credibility in the subject? What, in all this, is the role of the initial instruction to the subject? On the other hand, suppose the subject is told he is communicating with a machine. What is he led to believe about the machine as a result of his conversational experience with it? Some subjects have been very hard to convince that ELIZA (with its present script) is not human. This is a striking form of Turing’s test. What experimental design would make it more nearly rigorous and airtight?
机器输出(对人类而言)的可信度的整个问题需要调查。重要决策越来越倾向于根据计算机输出做出。对“机器所说的内容”负有最终责任的人类解释者与 E LIZA的通讯员并无不同,他们不断面临着做出可信度判断的需要。至少,《E LIZA》表明,创造和维持理解的幻觉是多么容易,因此也许是值得可信的判断。那里潜伏着某种危险。
The whole issue of the credibility (to humans) of machine output demands investigation. Important decisions increasingly tend to be made in response to computer output. The ultimately responsible human interpreter of “What the machine says” is not unlike the correspondent with ELIZA, constantly faced with the need to make credibility judgments. ELIZA shows, if nothing else, how easy it is to create and maintain the illusion of understanding, hence perhaps, of judgment deserving of credibility. A certain danger lurks there.
当前的 E LIZA脚本不包含有关现实世界的信息的想法并不完全正确。例如,导致输入的转换规则
The idea that the present ELIZA script contains no information about the real world is not entirely true. For example, the transformation rules which cause the input
每个人都讨厌我
Everybody hates me
转变为
to be transformed to
你能想到一个特别的人吗
Can you think of anyone in particular
以及其他类似的假设都是基于对世界的非常具体的假设。整个剧本以松散的方式构成了世界某些方面的模型。编写脚本的行为是一种编程行为,具有编程的所有优点,尤其是它清楚地表明了程序员对其主题的理解和掌握的不足之处。
and other such are based on quite specific hypotheses about the world. The whole script constitutes, in a loose way, a model of certain aspects of the world. The act of writing a script is a kind of programming act and has all the advantages of programming, most particularly that it clearly shows where the programmer’s understanding and command of his subject leaves off.
ELIZA 的优雅很大程度上归功于它用如此少的机制维持了理解的幻觉。但 ELIZA 的“理解”能力的可扩展性是有限的,这取决于 ELIZA 程序本身,而不取决于提供给它的任何脚本。每个老师都应该知道,理解的关键考验不是受试者继续对话的能力,而是从所讲内容中得出有效结论的能力。为了使计算机程序能够做到这一点,它至少必须有能力存储其输入的选定部分。ELIZA 会丢弃它的每一个输入,只有少数通过 MEMORY 机制转换的输入除外。[编辑:一些输入保存在 MEMORY 数据结构中,以便当对话似乎逐渐冷场时,可以重新提起用户之前在对话中提到的内容。] 当然,问题不仅仅在于存储。事实上,其中很大一部分都包含在上面使用的“选定”一词之下。迄今为止,ELIZA 的使用的主要目的之一就是掩盖其缺乏理解。但为了鼓励对话伙伴提供可供它从中选取补救信息的输入,它必须揭示自己的误解。从隐藏误解到揭示误解的目标转换,被视为使类似 ELIZA 的程序成为有效的自然语言人机通信系统基础的先决条件。
A large part of whatever elegance may be credited to ELIZA lies in the fact that ELIZA maintains the illusion of understanding with so little machinery. But there are bounds on the extendability of ELIZA’s “understanding” power, which are a function of the ELIZA program itself and not a function of any script it may be given. The crucial test of understanding, as every teacher should know, is not the subject’s ability to continue a conversation, but to draw valid conclusions from what he is being told. In order for a computer program to be able to do that, it must at least have the capacity to store selected parts of its inputs. ELIZA throws away each of its inputs, except for those few transformed by means of the MEMORY machinery. [EDITOR: A few inputs are saved in the MEMORY data structure so that things the user has mentioned earlier in the conversation can be revived when the dialog seems to have petered out.] Of course, the problem is more than one of storage. A great part of it is, in fact, subsumed under the word “selected” used just above. ELIZA in its use so far has had as one of its principal objectives the concealment of its lack of understanding. But to encourage its conversational partner to offer inputs from which it can select remedial information, it must reveal its misunderstanding. A switch of objectives from the concealment to the revelation of misunderstanding is seen as a precondition to making an ELIZA-like program the basis for an effective natural language man–machine communication system.
因此,增强型 E LIZA程序的一个目标是建立一个系统,该系统已经能够访问有关现实世界某些方面的信息存储,并且通过与人的对话交互,可以揭示它所知道的内容,即表现得像信息检索系统,以及它的知识结束和需要扩充的地方。希望其知识的增加也将是其对话经验的直接结果。正是这样的程序将与许多人交谈并从他们每个人身上学到一些东西,这使得人们希望它能够成为一个有趣甚至有用的对话伙伴。
One goal for an augmented ELIZA program is thus a system which already has access to a store of information about some aspects of the real world and which, by means of conversational interaction with people, can reveal both what it knows, i.e., behave as an information retrieval system, and where its knowledge ends and needs to be augmented. Hopefully the augmentation of its knowledge will also be a direct consequence of its conversational experience. It is precisely the prospect that such a program will converse with many people and learn something from each of them, which leads to the hope that it will prove an interesting and even useful conversational partner.
表述略有不同的中间目标的一种方法是,应该赋予 E LIZA慢慢建立与之对话的主体模型的能力。例如,如果主题提到他没有结婚,然后又谈到他的妻子,那么 E LIZA应该能够初步推断出他要么是鳏夫,要么是离婚的。当然,他可能只是感到困惑。从长远来看,ELIZA应该能够建立主体的信念结构(用阿贝尔森的话来说),并在此基础上发现主体的合理化、矛盾等。与这样的ELIZA的对话常常会变成争论。为实现这些目标已经采取了重要步骤。其中最值得注意的是 Abelson 和 Carroll 在模拟信念结构方面的工作(Abelson 和 Carroll,1965)。
One way to state a slightly different intermediate goal is to say that ELIZA should be given the power to slowly build a model of the subject conversing with it. If the subject mentions that he is not married, for example, and later speaks of his wife, then ELIZA should be able to make the tentative inference that he is either a widower or divorced. Of course, he could simply be confused. In the long run, ELIZA should be able to build up a belief structure (to use Abelson’s phrase) of the subject and on that basis detect the subject’s rationalizations, contradictions, etc. Conversations with such an ELIZA would often turn into arguments. Important steps in the realization of these goals have already been taken. Most notable among these is Abelson’s and Carroll’s work on simulation of belief structures (Abelson and Carroll, 1965).
构成大部分讨论基础的剧本恰好是一个具有压倒性心理倾向的剧本。其原因已经讨论过。然而,存在一个危险,即该示例可能会偏离其应说明的内容。记住 E LIZA程序本身只是技术编程意义上的翻译处理器是有用的。Gorn(1964)在一篇关于语言系统的论文中说:
The script that has formed the basis for most of this discussion happens to be one with an overwhelmingly psychological orientation. The reason for this has already been discussed. There is a danger, however, that the example will run away with what it is supposed to illustrate. It is useful to remember that the ELIZA program itself is merely a translating processor in the technical programming sense. Gorn (1964) in a paper on language systems says:
给定一种已经拥有语义内容的语言,那么一个翻译处理器,即使它只在语法上运行,也会生成另一种语言的相应表达,我们可以将其归因于“含义”(可能是多个——翻译者可能不是一对一的)生成源表达式的“语义意图”;当然,我们是否发现结果一致或有用或两者兼而有之,则是另一个问题。通过这种方法,很可能可以为每个表达式有效地为相同的语法对象语言分配多个含义。……
Given a language which already possesses semantic content, then a translating processor, even if it operates only syntactically, generates corresponding expressions of another language to which we can attribute as “meanings” (possibly multiple—the translator may not be one to one) the “semantic intents” of the generating source expressions; whether we find the result consistent or useful or both is, of course, another problem. It is quite possible that by this method the same syntactic object language can be usefully assigned multiple meanings for each expression. …
令人惊讶的是,他的话与 ELIZA 如此契合。“给定语言”是英语,“另一种语言”也是英语,所生成的正是后者的表达式。原则上,给定的语言也可以是向高中生提出代数“文字题”时所用的那种英语,而另一种语言则是一种允许特定计算机“解决”所述问题的机器代码。(参见 Bobrow 的 STUDENT 程序 [Bobrow, 1964]。)
It is striking to note how well his words fit ELIZA. The “given language” is English as is the “other,” expressions of which are generated. In principle, the given language could as well be the kind of English in which “word problems” in algebra are given to high school students and the other language, a machine code allowing a particular computer to “solve” the stated problems. (See Bobrow’s program STUDENT [Bobrow, 1964].)
上述言论的目的是进一步剥去 ELIZA 的魔幻光环,而它在心理学题材上的应用在一定程度上助长了这种光环。以最冷静的眼光来看,ELIZA 是戈恩意义上的翻译处理器;然而,它是专门为与自然语言文本配合良好而构建的。
The intent of the above remarks is to further rob ELIZA of the aura of magic to which its application to psychological subject matter has to some extent contributed. Seen in the coldest possible light, ELIZA is a translating processor in Gorn’s sense; however, it is one which has been especially constructed to work well with natural language text.
经计算机协会许可,转载自 Weizenbaum (1966)。
Reprinted from Weizenbaum (1966), with permission from the Association for Computing Machinery.
有关作者的背景,请参阅第 26 章,该章展示了 Dijkstra 作为计算方面的数学和逻辑思想家的技能。但 Dijkstra 始终是一名工程师,一名非常熟练的系统设计师和开发人员。这个成功的设计和开发项目不仅巩固了 Dijkstra 作为完整意义上的计算机科学家的声誉,还激发了对可证明的、安全的、可靠的操作系统的大量学术研究。
For background on the author, see chapter 26, a demonstration of Dijkstra’s skill as a mathematical and logical thinker about computations. But Dijkstra was ever an engineer, a very skilled system designer and developer. This successful design and development project not only cemented Dijkstra’s reputation as a computer scientist in the fullest sense of the word; it stimulated a great deal of academic research into provable, secure, reliable operating systems.
附录介绍了信号量,这是一种非常重要的控制抽象,它以适合一般推理的形式封装了当时用于在并发程序中原子地锁定和解锁资源的各种特定于机器的技术。
The Appendix presents semaphores, a very important control abstraction, encapsulating in a form suitable for general reasoning a variety of machine-specific techniques then in use for atomically locking and unlocking resources in concurrent programs.
为了响应明确要求“及时研究和开发工作”论文的号召,我提交了一份关于埃因霍温科技大学数学系多道程序设计工作的进度报告。
In response to a call explicitly asking for papers “on timely research and development efforts,” I present a progress report on the multiprogramming effort at the Department of Mathematics at the Technological University in Eindhoven.
我们的资源非常有限(即六个人的小组,平均只有一半的时间),并且希望为系统设计艺术做出贡献——包括概念、构建和验证的所有阶段,我们面临着以下问题:如何获得必要的经验的问题。为了解决这个问题,我们采取了以下三个指导原则:
Having very limited resources (viz. a group of six people of, on the average, half-time availability) and wishing to contribute to the art of system design—including all the stages of conception, construction, and verification, we were faced with the problem of how to get the necessary experience. To solve this problem we adopted the following three guiding principles:
1. 选择一个你能想到的尽可能先进的项目,尽可能雄心勃勃的项目,希望日常工作能够保持在最低限度;顶住所有压力来合并此类系统扩展,这只会导致待完成工作总量的纯粹数量增加。
1. Select a project as advanced as you can conceive, as ambitious as you can justify, in the hope that routine work can be kept to a minimum; hold out against all pressure to incorporate such system expansions that would only result in a purely quantitative increase of the total amount of work to be done.
2. 选择具有良好基本特性的机器(例如,令人爱上的中断系统肯定是一个鼓舞人心的特性);从那时起,请尽可能长时间地不要考虑您正在准备系统的配置的特定属性。
2. Select a machine with sound basic characteristics (e.g. an interrupt system to fall in love with is certainly an inspiring feature); from then on try to keep the specific properties of the configuration for which you are preparing the system out of your considerations as long as possible.
3. 要意识到,经验绝不会自动带来智慧和理解;换句话说,有意识地努力从以前的经历中尽可能多地学习。
3. Be aware of the fact that experience does by no means automatically lead to wisdom and understanding; in other words, make a conscious effort to learn as much as possible from your previous experiences.
因此,我将尝试不仅仅报告我们做了什么以及如何做,我还将尝试阐述我们所学到的东西。
Accordingly, I shall try to go beyond just reporting what we have done and how, and I shall try to formulate as well what we have learned.
我想以两段关于工作条件的简短评论来结束介绍,这是为了完整起见。我将不再强调这些要点。
I should like to end the introduction with two short remarks on working conditions, which I make for the sake of completeness. I shall not stress these points any further.
一种说法是,如果一个人与同时承担其他义务的半职人员一起工作,那么生产速度就会严重减慢。这至少是四的因数;也许情况更糟。人们自己在转换中浪费了时间和精力;整个团体会失去决策速度,因为在需要时,讨论往往要推迟到所有相关人员都可以参加时才进行。
One remark is that production speed is severely slowed down if one works with half-time people who have other obligations as well. This is at least a factor of four; probably it is worse. The people themselves lose time and energy in switching over; the group as a whole loses decision speed as discussions, when needed, have often to be postponed until all people concerned are available.
另外一点是,该小组的成员(大部分是数学家)都曾作为优秀学生接受过五到八年的大学培训,并且是硕士或博士。等级。我明确提及这一点是因为至少在我国,系统设计所需的智力水平普遍被严重低估。我比以往任何时候都更加确信,这类工作非常困难,与最优秀的人以外的人一起完成这项工作的每一次努力都注定要么失败,要么付出巨大的代价取得一定的成功。
The other remark is that the members of the group (mostly mathematicians) have previously enjoyed as good students a university training of five to eight years and are of Master’s or Ph.D. level. I mention this explicitly because at least in my country the intellectual level needed for system design is in general grossly underestimated. I am convinced more than ever that this type of work is very difficult, and that every effort to do it with other than the best people is doomed to either failure or moderate success at enormous expense.
该系统是为荷兰机器 EL X8(NV Electrologica,Rijswijk (ZH))设计的。我们的配置的特点是:
The system has been designed for a Dutch machine, the EL X8 (N.V. Electrologica, Rijswijk (ZH)). Characteristics of our configuration are:
1. 核心内存周期时间2.5μ秒,27位;目前32K;
1. core memory cycle time 2.5 μsec, 27 bits; at present 32K;
2. 512K 字鼓,每轨 1024 字,rev。时间 40 毫秒;
2. drum of 512K words, 1024 words per track, rev. time 40 msec;
3.非常适合堆栈实现的间接寻址机制;
3. an indirect addressing mechanism very well suited for stack implementation;
4. 完善的外设命令和中断控制系统;
4. a sound system for commanding peripherals and controlling of interrupts;
5. 潜在大量低容量信道;使用其中 10 台(3 个 1000 字符/秒的纸带阅读器;3 个 150 字符/秒的纸带打孔机;2 台电传打印机;一台绘图仪;一台行式打印机);
5. a potentially great number of low capacity channels; ten of them are used (3 paper tape readers at 1000 char/sec; 3 paper tape punches at 150 char/sec; 2 teleprinters; a plotter; a line printer);
6.缺少一些不寻常的、尴尬的特征。
6. absence of a number of not unusual, awkward features.
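A few derived figures help put the configuration above in perspective. The following is only a back-of-envelope sketch: it assumes that "K" means 1024 and that a request finds the drum at a uniformly random rotational position, neither of which the text states; the variable names are illustrative.

```python
# Back-of-envelope figures derived from the configuration listed above.
# Assumption: "K" means 1024, the usual convention for memory sizes.
K = 1024

drum_words = 512 * K        # drum capacity in words
words_per_track = 1024      # from the configuration list
revolution_ms = 40.0        # drum revolution time in milliseconds
core_cycle_us = 2.5         # core memory cycle time in microseconds

tracks = drum_words // words_per_track
# Assumption: a request arrives at a random moment, so it waits half a turn on average.
avg_latency_ms = revolution_ms / 2
# Rate at which words pass under the heads of one track.
words_per_ms = words_per_track / revolution_ms
# How many core memory cycles fit into one drum revolution.
core_cycles_per_rev = revolution_ms * 1000 / core_cycle_us

print(f"tracks on drum:            {tracks}")
print(f"average rotational delay:  {avg_latency_ms} ms")
print(f"transfer rate per track:   {words_per_ms} words/ms")
print(f"core cycles per drum turn: {core_cycles_per_rev:.0f}")
```

Under these assumptions one drum revolution costs thousands of core cycles, which suggests why the choice of drum page matters to the design.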
该系统的主要目标是平稳地处理连续的用户程序流,作为大学的服务。选择多道程序设计系统时考虑到以下目标:(1) 减少短期程序的周转时间,(2) 外围设备的经济使用,(3) 与后备存储相结合的自动控制中央处理器的经济使用,以及(4)将机器用于仅需要通用计算机的灵活性但(通常)不需要容量或处理能力的应用的经济可行性。
The primary goal of the system is to process smoothly a continuous flow of user programs as a service to the University. A multiprogramming system has been chosen with the following objectives in mind: (1) a reduction of turn-around time for programs of short duration, (2) economic use of peripheral devices, (3) automatic control of backing store to be combined with economic use of the central processor, and (4) the economic feasibility to use the machine for those applications for which only the flexibility of a general purpose computer is needed, but (as a rule) not the capacity nor the processing power.
该系统并非旨在作为多路访问系统。不存在可供独立用户相互通信的通用数据库:他们仅共享配置和过程库(包括用复数扩展的 ALGOL 60 的转换器)。该系统不适合用机器语言编写的用户程序。……
The system is not intended as a multiaccess system. There is no common data base via which independent users can communicate with each other: they only share the configuration and a procedure library (that includes a translator for ALGOL 60 extended with complex numbers). The system does not cater for user programs written in machine language. …
我们犯了一些常见的小错误(例如过于关注消除不是真正瓶颈的内容)和两个重大错误。
We have made some minor mistakes of the usual type (such as paying too much attention to eliminating what was not the real bottleneck) and two major ones.
我们的第一个重大错误是,在很长一段时间内,我们将注意力局限于“完美的安装”;当我们考虑如何充分利用它时,其中一个外围设备发生了故障,我们面临着棘手的问题。处理“病态”比我们预想的要花费更多的精力,而我们的一些麻烦是我们早期聪明才智的直接后果,即系统本可以自行应对的情况的复杂性。如果我们在设计的早期阶段就注意到病态,我们的管理规则肯定不会那么细化。
Our first major mistake was that for too long a time we confined our attention to “a perfect installation”; by the time we considered how to make the best of it when, say, one of the peripherals broke down, we were faced with nasty problems. Taking care of the “pathology” took more energy than we had expected, and some of our troubles were a direct consequence of our earlier ingenuity, i.e., the complexity of the situation into which the system could have maneuvered itself. Had we paid attention to the pathology at an earlier stage of the design, our management rules would certainly have been less refined.
第二个主要错误是我们在构思和编程系统的主要部分时没有充分考虑调试问题。我必须否认这一错误并没有造成严重后果这一事实——相反!人们可能会事后争论。
The second major mistake has been that we conceived and programmed the major part of the system without giving more than scanty thought to the problem of debugging it. I must decline all credit for the fact that this mistake had no serious consequences—on the contrary! one might argue as an afterthought.
作为船员的队长,我在制作处理实时中断的基本软件方面拥有丰富的经验(可以追溯到 1958 年),并且我通过痛苦的经验知道,由于中断时刻的不可重复性,程序错误可能会出现就像机器偶尔发生故障一样具有误导性。结果我就害怕极了。由于担心调试的可能性,我们决定尽可能小心,预防胜于治疗,尽量防止讨厌的错误进入结构。
As captain of the crew I had had extensive experience (dating back to 1958) in making basic software dealing with real-time interrupts, and I knew by bitter experience that as a result of the irreproducibility of the interrupt moments a program error could present itself misleadingly like an occasional machine malfunctioning. As a result I was terribly afraid. Having fears regarding the possibility of debugging, we decided to be as careful as possible and, prevention being better than cure, to try to prevent nasty bugs from entering the construction.
这个受到恐惧启发的决定是我认为该小组对系统设计艺术的主要贡献的基础。我们发现,设计一个精致的多道程序设计系统是可能的,其逻辑健全性可以被先验地证明,并且其实现可以接受详尽的测试。测试期间出现的唯一错误是微不足道的编码错误(每 500 条指令发生 1 个错误的密度),每个错误都可以在机器 10 分钟(传统)检查内找到,并且每个错误都相应地易于纠正。在撰写本文时,测试尚未完成,但最终的系统保证是完美的。当系统交付时,我们不会永远担心在不太可能的情况下仍然会发生系统脱轨,例如可能是由于两个或多个关键事件的不幸“巧合”造成的,因为我们将证明以下内容的正确性:该系统具有严格性和明确性,这对于绝大多数数学证明来说是不寻常的。
This decision, inspired by fear, is at the bottom of what I regard as the group’s main contribution to the art of system design. We have found that it is possible to design a refined multiprogramming system in such a way that its logical soundness can be proved a priori and its implementation can admit exhaustive testing. The only errors that showed up during testing were trivial coding errors (occurring with a density of one error per 500 instructions), each of them located within 10 minutes (classical) inspection by the machine and each of them correspondingly easy to remedy. At the time this was written the testing had not yet been completed, but the resulting system is guaranteed to be flawless. When the system is delivered we shall not live in the perpetual fear that a system derailment may still occur in an unlikely situation, such as might result from an unhappy “coincidence” of two or more critical occurrences, for we shall have proved the correctness of the system with a rigor and explicitness that is unusual for the great majority of mathematical proofs.
我们采取了另一种方法,事实证明,这种方法取得了巨大的优势。在我们的术语中,我们严格区分了内存单元(我们称之为“页面”,有“核心页面”和“鼓页面”)和相应的信息单元(由于缺乏更好的词,我们称之为“段”),片段恰好适合页面。对于段,我们创建了完全独立的标识机制,其中可能的段标识符的数量远大于主存储和辅助存储中的页面总数。段标识符可以快速访问核心中所谓的“段变量”,其值表示该段是否仍为空,如果不为空,则可以在哪个页面(或多个页面)中找到该段。
We have followed another approach and, as it turned out, to great advantage. In our terminology we made a strict distinction between memory units (we called them “pages” and had “core pages” and “drum pages”) and corresponding information units (for lack of a better word we called them “segments”), a segment just fitting in a page. For segments we created a completely independent identification mechanism in which the number of possible segment identifiers is much larger than the total number of pages in primary and secondary store. The segment identifier gives fast access to a so-called “segment variable” in core whose value denotes whether the segment is still empty or not, and if not empty, in which page (or pages) it can be found.
由于这种方法,如果必须将驻留在核心页中的信息段转储到鼓上以使核心页可供其他使用,则无需将该段返回到同一鼓它最初来自的页面。事实上,这种自由度被利用了:在空闲的鼓页面中,选择等待时间最短的鼓页面。下一个结果是完全不存在鼓分配问题:没有任何理由说明程序应该占用连续的鼓页面。在多道程序设计环境中,这非常方便。
As a consequence of this approach, if a segment of information, residing in a core page, has to be dumped onto the drum in order to make the core page available for other use, there is no need to return the segment to the same drum page from which it originally came. In fact, this freedom is exploited: among the free drum pages the one with minimum latency time is selected. A next consequence is the total absence of a drum allocation problem: there is not the slightest reason why, say, a program should occupy consecutive drum pages. In a multiprogramming environment this is very convenient.
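The separation of segments from pages can be sketched as follows. This is a minimal illustration, not the THE system's actual data layout: `SegmentVariable`, `latency`, `dump_segment`, the constant `SECTORS`, and the rule that the old drum copy is released are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

SECTORS = 8   # hypothetical: page-sized slots passing the heads per revolution

@dataclass
class SegmentVariable:
    """Where a segment currently lives; both fields None means 'still empty'."""
    core_page: Optional[int] = None
    drum_page: Optional[int] = None

def latency(drum_page: int, head_pos: int) -> int:
    """Slots that must pass under the heads before drum_page comes up."""
    return (drum_page % SECTORS - head_pos) % SECTORS

def dump_segment(seg: SegmentVariable, free_drum_pages: set, head_pos: int) -> int:
    """Write a core-resident segment to the free drum page that comes up soonest.

    Returns the core page that has been made available.
    """
    target = min(free_drum_pages, key=lambda p: latency(p, head_pos))
    free_drum_pages.remove(target)
    if seg.drum_page is not None:
        # Assumption of this sketch: the old drum copy is obsolete, so its page is freed.
        free_drum_pages.add(seg.drum_page)
    freed_core = seg.core_page
    seg.core_page, seg.drum_page = None, target
    return freed_core

# Segment "A" sits in core page 3 and has an older copy on drum page 17.
segments = {"A": SegmentVariable(core_page=3, drum_page=17)}
free = {2, 5, 12}
core = dump_segment(segments["A"], free, head_pos=4)
print(core, segments["A"], sorted(free))
```

Because the segment identifier leads to the segment variable rather than to a fixed address, the segment can land on whichever drum page is convenient, and nothing else in the system needs to know that it moved.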
这使我们能够根据这些抽象的“顺序过程”来设计整个系统。它们的和谐合作是通过明确的相互同步声明来调节的。一方面,这种明确的相互同步是必要的,因为我们不对速度比做出任何假设;另一方面,这种相互同步是可能的,因为“暂时延迟进程的进度”永远不会损害被延迟进程的内部逻辑。这种方法的根本后果是。显式相互同步——可以通过离散推理建立一组这样的顺序过程的和谐合作;作为进一步的结果,协作顺序进程的整个和谐社会独立于可用于执行这些进程的处理器的实际数量,只要可用的处理器可以在进程之间切换。
This enabled us to design the whole system in terms of these abstract “sequential processes.” Their harmonious cooperation is regulated by means of explicit mutual synchronization statements. On the one hand, this explicit mutual synchronization is necessary, as we do not make any assumption about speed ratios; on the other hand, this mutual synchronization is possible because “delaying the progress of a process temporarily” can never be harmful to the interior logic of the process delayed. The fundamental consequence of this approach—viz. the explicit mutual synchronization—is that the harmonious cooperation of a set of such sequential processes can be established by discrete reasoning; as a further consequence the whole harmonious society of cooperating sequential processes is independent of the actual number of processors available to carry out these processes, provided the processors available can switch from process to process.
在级别 0,我们发现负责将处理器分配给其动态进程在逻辑上是允许的进程之一(即考虑到显式相互同步)。在此级别,处理并引入实时时钟的中断,以防止任何进程独占处理能力。在这个级别,合并了优先级规则,以实现系统在需要时的快速响应。我们的第一个抽象已经实现了;在 0 级以上,实际共享的处理器数量不再相关。在更高的层次上,我们发现不同顺序进程的活动,失去其身份的实际处理器已从图片中消失。
At level 0 we find the responsibility for processor allocation to one of the processes whose dynamic progress is logically permissible (i.e. in view of the explicit mutual synchronization). At this level the interrupt of the real-time clock is processed and introduced to prevent any process to monopolize processing power. At this level a priority rule is incorporated to achieve quick response of the system where this is needed. Our first abstraction has been achieved; above level 0 the number of processors actually shared is no longer relevant. At higher levels we find the activity of the different sequential processes, the actual processor that had lost its identity having disappeared from the picture.
在第 1 级,我们有所谓的“段控制器”,这是一个与鼓中断和更高级别的顺序过程同步的顺序过程。在第 1 级,我们有责任满足自动后备存储所产生的簿记需求。在这个级别我们已经实现了下一个抽象;在所有更高级别的信息识别都是按照片段进行的,失去其身份的实际存储页面已从图片中消失。
At level 1 we have the so-called “segment controller,” a sequential process synchronized with respect to the drum interrupt and the sequential processes on higher levels. At level 1 we find the responsibility to cater to the bookkeeping resulting from the automatic backing store. At this level our next abstraction has been achieved; at all higher levels identification of information takes place in terms of segments, the actual storage pages that had lost their identity having disappeared from the picture.
在第 2 级,我们发现“消息解释器”负责控制台键盘的分配,通过该控制台键盘可以在操作员和任何更高级别的进程之间进行对话。消息解释器与操作员紧密同步工作。当操作员按下一个键时,一个字符与一个中断信号一起发送到机器,以宣布下一个键盘字符,而实际的打印是通过机器在消息解释器的控制下生成的输出命令来完成的。(就硬件而言,控制台电传打印机被视为两个独立的外围设备:输入键盘和输出打印机。)如果其中一个进程打开一个对话,它会在对话的开头句中标识自己,以便于运营商。然而,如果操作员打开对话,则他必须在对话的开头句中识别他正在寻址的进程,即,必须先解释该开头句,然后才能知道对话针对哪个进程!这就是为控制台电传打字机引入单独的顺序过程的逻辑原因,这一原因反映在其名称“消息解释器”中。
At level 2 we find the “message interpreter” taking care of the allocation of the console keyboard via which conversations between the operator and any of the higher level processes can be carried out. The message interpreter works in close synchronism with the operator. When the operator presses a key, a character is sent to the machine together with an interrupt signal to announce the next keyboard character, whereas the actual printing is done through an output command generated by the machine under control of the message interpreter. (As far as the hardware is concerned the console teleprinter is regarded as two independent peripherals: an input keyboard and an output printer.) If one of the processes opens a conversation, it identifies itself in the opening sentence of the conversation for the benefit of the operator. If, however, the operator opens a conversation, he must identify the process he is addressing, in the opening sentence of the conversation, i.e., this opening sentence must be interpreted before it is known to which of the processes the conversation is addressed! Here lies the logical reason for the introduction of a separate sequential process for the console teleprinter, a reason that is reflected in its name, “message interpreter.”
在第 2 级之上,就好像每个进程都有其私人对话控制台。他们共享相同的物理控制台这一事实被转化为“一次只能进行一个对话”形式的资源限制,这种限制可以通过相互同步来满足。在这个级别,下一个抽象已经实现;在更高的级别上,实际的控制台电传打印机失去了它的身份。(如果消息解释器的级别不高于段控制器,那么实现它的唯一方法就是在核心中为其进行永久保留;因为会话词汇量可能会变大(一旦我们的操作员希望在奇特的消息中得到解决),这将导致对核心存储的永久需求过大。因此,表达消息的词汇表存储在段,即也可以驻留在鼓上的信息单元。因此,消息解释器比段控制器高一级。)
Above level 2 it is as if each process had its private conversational console. The fact that they share the same physical console is translated into a resource restriction of the form “only one conversation at a time,” a restriction that is satisfied via mutual synchronization. At this level the next abstraction has been implemented; at higher levels the actual console teleprinter loses its identity. (If the message interpreter had not been on a higher level than the segment controller, then the only way to implement it would have been to make a permanent reservation in core for it; as the conversational vocabulary might become large (as soon as our operators wish to be addressed in fancy messages), this would result in too heavy a permanent demand upon core storage. Therefore, the vocabulary in which the messages are expressed is stored on segments, i.e., as information units that can reside on the drum as well. For this reason the message interpreter is one level higher than the segment controller.)
在第 3 级,我们发现与输入流缓冲和输出流取消缓冲相关的顺序过程。在此级别上进行下一个抽象,即。在此级别分配给“逻辑通信单元”的实际使用的外围设备的抽象,“逻辑通信单元”在更高级别上工作。与外围设备相关的顺序处理的级别高于消息解释器,因为它们必须能够与操作员对话(例如,在检测到故障的情况下)。外围设备的有限数量再次成为较高级别进程的资源限制,需要通过它们之间的相互同步来满足。
At level 3 we find the sequential processes associated with buffering of input streams and unbuffering of output streams. At this level the next abstraction is effected, viz. the abstraction of the actual peripherals used that are allocated at this level to the “logical communication units” in terms of which are worked in the still higher levels. The sequential processes associated with the peripherals are of a level above the message interpreter, because they must be able to converse with the operator (e.g. in the case of detected malfunctioning). The limited number of peripherals again acts as a resource restriction for the processes at higher levels to be satisfied by mutual synchronization between them.
在第 4 层,我们找到独立的用户程序,在第 5 层,我们找到操作员(不是我们实现的)。
At level 4 we find the independent user programs and at level 5 the operator (not implemented by us).
为了使下一节更容易理解,已经详细描述了系统结构。
The system structure has been described at length in order to make the next section intelligible.
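The level 0 responsibility described above, sharing one processor among the processes that are able to progress, can be illustrated with a toy round-robin loop. This is only a sketch under strong assumptions: Python generators stand in for sequential processes, each `yield` stands in for a clock interrupt, and the priority rule and blocking on synchronization are left out. The names `worker` and `run_level0` are hypothetical.

```python
from collections import deque

def worker(name, steps, log):
    """An abstract sequential process; each yield marks a clock interrupt."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield

def run_level0(processes):
    """Round-robin over the processes that can still make progress."""
    ready = deque(processes)
    while ready:
        proc = ready.popleft()
        try:
            next(proc)            # run the process for one time slice
            ready.append(proc)    # preempted by the clock: back of the queue
        except StopIteration:
            pass                  # the process has finished; drop it

log = []
run_level0([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

The workers contain no reference to the processor or to each other, which is the point of the abstraction: above level 0 the code reads the same whether one processor or several carry it out.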
构思阶段花了很长时间。在那段时间里,我们在上一节中概述了系统的概念已经诞生。此外,我们还学会了推理的艺术,通过这种艺术,我们可以从我们的需求中推断出流程应通过相互同步而相互影响的方式,以便满足这些需求。(要求是信息在产生之前不能使用,任何外围设备不能同时设置两个任务,等等。)最后,我们学会了推理的艺术,通过它我们可以证明社会是由相互的过程组成的。彼此同步确实会在其时间行为满足所有要求。
The conception stage took a long time. During that period of time the concepts have been born in terms of which we sketched the system in the previous section. Furthermore, we learned the art of reasoning by which we could deduce from our requirements the way in which the processes should influence each other by their mutual synchronization so that these requirements would be met. (The requirements being that no information can be used before it has been produced, that no peripheral can be set to two tasks simultaneously, etc.) Finally we learned the art of reasoning by which we could prove that the society composed of processes thus mutually synchronized by each other would indeed in its time behavior satisfy all requirements.
构建阶段相当传统,甚至可能是老式的,即纯机器代码。由于规格变化而重新编程的情况很少见,这种情况一定对“蒸汽方法”的可行性做出了很大贡献。前两个阶段花费的时间比计划的要长,但机器交付的延迟在一定程度上弥补了这一不足。
The construction stage has been rather traditional, perhaps even old-fashioned, that is, plain machine code. Reprogramming on account of a change of specifications has been rare, a circumstance that must have contributed greatly to the feasibility of the “steam method.” That the first two stages took more time than planned was somewhat compensated by a delay in the delivery of the machine.
在验证阶段,我们在短射期间完全可以使用机器;这些是我们在没有任何软件辅助调试的原始机器上工作的镜头。从级别 0 开始对系统进行测试,每次仅在前一个级别经过彻底测试后才添加(一部分)下一个级别。每个测试镜头本身在要测试的(部分)系统之上包含多个具有双重功能的测试过程。首先,他们必须迫使系统进入所有不同的相关状态;其次,他们必须验证系统是否继续按照规范做出反应。
In the verification stage we had the machine, during short shots, completely at our disposal; these were shots during which we worked with a virgin machine without any software aids for debugging. Starting at level 0 the system was tested, each time adding (a portion of) the next level only after the previous level had been thoroughly tested. Each test shot itself contained, on top of the (partial) system to be tested, a number of testing processes with a double function. First, they had to force the system into all different relevant states; second, they had to verify that the system continued to react according to specification.
我不会否认这些测试程序的构建是一项重大的智力工作:要说服自己没有忽视“相关状态”,并说服自己测试程序生成了所有这些都不是一件简单的事情。令人鼓舞的是(据我们所知!)这是可以做到的。
I shall not deny that the construction of these testing programs has been a major intellectual effort: to convince oneself that one has not overlooked “a relevant state” and to convince oneself that the testing programs generate them all is no simple matter. The encouraging thing is that (as far as we know!) it could be done.
这一事实是等级结构带来的令人高兴的结果之一。
This fact was one of the happy consequences of the hierarchical structure.
测试级别 0(实时时钟和处理器分配)意味着在其之上进行多个测试顺序进程,一起检查在所有情况下处理器时间是否按照规则在它们之间分配。建立此后,就实施了顺序流程。
Testing level 0 (the real-time clock and processor allocation) implied a number of testing sequential processes on top of it, inspecting together that under all circumstances processor time was divided among them according to the rules. This being established, sequential processes as such were implemented.
在第 1 级测试段控制器意味着所有“相关状态”都可以根据对核心页面提出(以各种组合)要求的顺序进程来制定,这些情况可能由测试程序之间的显式同步引发。在这个阶段,实时时钟的存在(尽管一直在中断)是如此无关紧要,以至于其中一位测试人员确实忘记了它的存在!
Testing the segment controller at level 1 meant that all “relevant states” could be formulated in terms of sequential processes making (in various combinations) demands on core pages, situations that could be provoked by explicit synchronization among the testing programs. At this stage the existence of the real-time clock—although interrupting all the time—was so immaterial that one of the testers indeed forgot its existence!
到那时,我们已经对来自实时时钟和鼓的(相互不同步的)中断实施了正确的反应。如果我们没有引入单独的级别 0 和 1,并且如果我们没有创建一个术语(即相当抽象的顺序过程的术语),其中时钟中断的存在可以被丢弃,而是尝试在非分层中为了使中央处理器直接对这两个中断的任何奇怪的时间连续做出反应,“相关状态”的数量将爆炸到如此高的程度,以至于详尽的测试将成为一种幻觉。(除此之外,我们是否有办法生成所有这些都是值得怀疑的,鼓和时钟速度超出了我们的控制范围。)……
By that time we had implemented the correct reaction upon the (mutually unsynchronized) interrupts from the real-time clock and the drum. If we had not introduced the separate levels 0 and 1, and if we had not created a terminology (viz. that of the rather abstract sequential processes) in which the existence of the clock interrupt could be discarded, but had instead tried in a nonhierarchical construction, to make the central processor react directly upon any weird time succession of these two interrupts, the number of “relevant states” would have exploded to such a height that exhaustive testing would have been an illusion. (Apart from that it is doubtful whether we would have had the means to generate them all, drum and clock speed being outside our control.) …
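The explosion of "relevant states" can be made concrete with a small calculation. The counts below are invented purely for illustration (the paper gives no figures), and the additive versus multiplicative model is a simplification of the argument rather than a result from the paper.

```python
# Hypothetical counts of "relevant states"; the paper gives no figures.
clock_states = 6    # assumed distinct situations involving the clock interrupt
drum_states = 20    # assumed distinct situations involving the drum interrupt

# Layered: level 0 is tested first, then level 1 on top of it, so the
# clock situations no longer need to be revisited.
layered = clock_states + drum_states

# Nonhierarchical: every clock situation can coincide with every drum situation.
flat = clock_states * drum_states

print(layered, flat)
```

With more interrupt sources the sum grows by one term per source while the product grows by one factor per source, which is the sense in which exhaustive testing would become an illusion.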
就程序验证而言,我没有提出任何本质上新的内容。在测试一个通用对象(无论是一个硬件、一个程序、一台机器还是一个系统)时,我们不能让它经历所有可能的情况:对于一台计算机来说,这意味着我们向它提供所有可能的程序!因此,必须使用一组相关的测试用例对其进行测试。只要将机制视为黑匣子,就无法确定什么是相关的,什么是不相关的;换句话说,决策必须基于待测试机制的内部结构。设计者似乎有责任以这样一种方式构建他的机制——即如此有效地构建——在测试过程的每个阶段,相关测试用例的数量将如此之小,以至于他可以尝试所有这些用例,并且正在执行的操作考验将是那么明显,他不会忽视任何情况。我对我们的系统进行了一项调查,因为我认为它是这种结构可能采取的形式的一个很好的例子。
As far as program verification is concerned I present nothing essentially new. In testing a general purpose object (be it a piece of hardware, a program, a machine, or a system), one cannot subject it to all possible cases: for a computer this would imply that one feeds it with all possible programs! Therefore one must test it with a set of relevant test cases. What is, or is not, relevant cannot be decided as long as one regards the mechanism as a black box; in other words, the decision has to be based upon the internal structure of the mechanism to be tested. It seems to be the designer’s responsibility to construct his mechanism in such a way—i.e. so effectively structured—that at each stage of the testing procedure the number of relevant test cases will be so small that he can try them all and that what is being tested will be so perspicuous that he will not have overlooked any situation. I have presented a survey of our system because I think it a nice example of the form that such a structure might take.
根据我的经验,我很抱歉地说,工业软件制造商往往会对系统做出复杂的反应。一方面,他们倾向于认为我们做了一种模范工作;另一方面,他们倾向于认为我们做得很好。另一方面,他们对所使用的技术是否适用于大学的庇护氛围之外表示怀疑,并表示我们之所以成功只是因为整个项目的规模不大。我无意低估处理更大工作和更多人员所需的组织能力,但我想冒险认为项目越大,结构就越重要!等级制度五个逻辑级别很可能会被证明是适度的深度,特别是当人们比我们更有意识地设计系统时,其目的是使软件能够平滑地适应(也许是剧烈的)配置扩展。……
In my experience, I am sorry to say, industrial software makers tend to react to the system with mixed feelings. On the one hand, they are inclined to think that we have done a kind of model job; on the other hand, they express doubts whether the techniques used are applicable outside the sheltered atmosphere of a University and express the opinion that we were successful only because of the modest scope of the whole project. It is not my intention to underestimate the organizing ability needed to handle a much bigger job, with a lot more people, but I should like to venture the opinion that the larger the project, the more essential the structuring! A hierarchy of five logical levels might then very well turn out to be of modest depth, especially when one designs the system more consciously than we have done, with the aim that the software can be smoothly adapted to (perhaps drastic) configuration expansions. …
进程“ Q ”执行操作“ P (sem)”,将名为“sem”的信号量值减 1。如果相关信号量的结果值为非负,则进程Q可以继续执行它的下一个声明;然而,如果结果值为负,则进程Q被停止并被记录在与相关信号量相关的等待列表中。在进一步通知之前(即对同一信号量进行V操作),进程Q的动态进程在逻辑上是不允许的,并且不会向其分配任何处理器(参见上面的“系统层次结构”,级别 0)。
A process, “Q” say, that performs the operation “P(sem)” decreases the value of the semaphore called “sem” by 1. If the resulting value of the semaphore concerned is nonnegative, process Q can continue with the execution of its next statement; if, however, the resulting value is negative, process Q is stopped and booked on a waiting list associated with the semaphore concerned. Until further notice (i.e. a V-operation on this very same semaphore), dynamic progress of process Q is not logically permissible and no processor will be allocated to it (see above “System Hierarchy,” at level 0).
执行操作“ V (sem)”的进程“ R ”将称为“sem”的信号量的值增加 1。如果相关信号量的结果值为正,则所讨论的V操作没有进一步的效果;然而,如果相关信号量的结果值为非正数,则在其等待列表上预订的进程之一将从该等待列表中删除,即,其动态进程在逻辑上再次是允许的,并且在适当的时候将向其分配处理器(再次参见上面的“系统层次结构”,第 0 级)。
A process, “R” say, that performs the operation “V(sem)” increases the value of the semaphore called “sem” by 1. If the resulting value of the semaphore concerned is positive, the V-operation in question has no further effect; if, however, the resulting value of the semaphore concerned is nonpositive, one of the processes booked on its waiting list is removed from this waiting list, i.e. its dynamic progress is again logically permissible and in due time a processor will be allocated to it (again, see above “System Hierarchy,” at level 0).
推论1.如果信号量值为非正数,则其绝对值等于其等待列表中预订的进程数。
COROLLARY 1. If a semaphore value is nonpositive its absolute value equals the number of processes booked on its waiting list.
推论2. P操作代表潜在的延迟,互补的 V 操作代表势垒的移除。
COROLLARY 2. The P-operation represents the potential delay, the complementary V-operation represents the removal of a barrier.
注1:P-和V-操作是“不可分割的动作”;即,如果它们在并行过程中“同时”发生,则它们是互不干扰的,因为它们可以被视为一个接一个地执行。
Note 1. P- and V-operations are “indivisible actions”; i.e. if they occur “simultaneously” in parallel processes they are noninterfering in the sense that they can be regarded as being performed one after the other.
注2:如果V操作产生的信号量值为负,则其等待列表原本包含多个进程。哪个等待进程将从等待列表中删除是未定义的(即逻辑上无关紧要)。
Note 2. If the semaphore value resulting from a V-operation is negative, its waiting list originally contained more than one process. It is undefined (i.e., logically immaterial) which of the waiting processes is then removed from the waiting list.
注3:上述机制的结果是,允许动态进展的进程只能通过实际进展(即,通过对初始值为非正值的信号量执行P操作)来失去此状态。
Note 3. A consequence of the mechanisms described above is that a process whose dynamic progress is permissible can only lose this status by actually progressing, i.e., by performance of a P-operation on a semaphore with a value that is initially nonpositive.
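The definitions, corollaries, and notes above can be traced with a small bookkeeping model. This is a sketch, not an implementation of the original primitives: it runs in a single thread (so the indivisibility of Note 1 holds trivially), it only records which processes are blocked, and it assumes a nonnegative initial value. The class and method names are illustrative.

```python
import random

class Semaphore:
    """Bookkeeping model of the P- and V-operations described above.

    It records which processes are blocked; it does not run real threads.
    """
    def __init__(self, initial):
        self.value = initial      # assumed to start nonnegative
        self.waiting = []         # processes booked on the waiting list

    def P(self, process):
        """Decrease the value by 1; return True if `process` may continue."""
        self.value -= 1
        if self.value >= 0:
            return True
        self.waiting.append(process)   # stopped until some later V-operation
        return False

    def V(self):
        """Increase the value by 1; return the process released, or None."""
        self.value += 1
        if self.value > 0:
            return None
        # Which waiter is released is left undefined (Note 2), so pick any.
        return self.waiting.pop(random.randrange(len(self.waiting)))

    def corollary_1_holds(self):
        """A nonpositive value has absolute value equal to the number of waiters."""
        return self.value > 0 or -self.value == len(self.waiting)

sem = Semaphore(1)
for q in ("Q1", "Q2", "Q3"):
    print(q, "continues" if sem.P(q) else "is stopped", "value =", sem.value)
print("released:", sem.V(), "value =", sem.value)
print("Corollary 1 holds:", sem.corollary_1_holds())
```

Letting the value go negative, rather than stopping at zero, is what makes Corollary 1 possible: the value itself counts the waiting processes.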
在系统构想期间,我们发现我们以两种完全不同的方式使用信号量。差异是如此明显,以至于回顾过去,人们怀疑这是否真的公平将这两种方法呈现为使用完全相同的原语。一方面,我们有用于互斥的信号量,另一方面,我们有私有信号量。
During system conception it transpired that we used the semaphores in two completely different ways. The difference is so marked that, looking back, one wonders whether it was really fair to present the two ways as uses of the very same primitives. On the one hand, we have the semaphores used for mutual exclusion, on the other hand, the private semaphores.
由于对“互斥体”进行P和V操作,标记为“临界区”的操作在时间上相互排斥;给出的方案允许直接扩展到两个以上的并行进程,互斥量的最大值等于1,如果我们有n个并行进程,则最小值等于−(n−1)。
As a result of the P- and V-operations on “mutex,” the actions marked as “critical sections” exclude each other mutually in time; the scheme given allows straightforward extension to more than two parallel processes; the maximum value of mutex equals 1, and the minimum value equals −(n − 1) if we have n parallel processes.
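Mutual exclusion with a semaphore initialized to 1 can be tried out with Python's standard `threading.Semaphore`, used here as a stand-in for the primitives of the paper. One difference should be kept in mind: Python's semaphore never shows a negative count, since it keeps its waiters separately, so the range from 1 down to −(n − 1) corresponds to the Python count minus the number of blocked threads. The names `N`, `ROUNDS`, `inside`, and `process` are illustrative.

```python
import threading

N = 4                            # number of parallel processes (arbitrary)
ROUNDS = 1000                    # critical sections entered by each process
mutex = threading.Semaphore(1)   # initial value 1: the critical section is free
inside = 0                       # processes currently in the critical section
max_inside = 0                   # largest value of `inside` ever observed
counter = 0                      # shared state variable

def process():
    global inside, max_inside, counter
    for _ in range(ROUNDS):
        mutex.acquire()          # P(mutex)
        inside += 1
        max_inside = max(max_inside, inside)
        counter += 1             # the critical section proper
        inside -= 1
        mutex.release()          # V(mutex)

threads = [threading.Thread(target=process) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("counter =", counter, "max processes inside =", max_inside)
```

With n processes, at most one is inside the section and at most n − 1 are waiting, which is where the minimum value −(n − 1) comes from.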
总是使用临界区,并且仅用于明确检查和修改描述系统当前状态的状态变量(分配在周围的宇宙中)(只要调节各个系统之间的和谐合作所需)。过程)。
Critical sections are used always, and only for the purpose of unambiguous inspection and modification of the state variables (allocated in the surrounding universe) that describe the current state of the system (as far as needed for the regulation of the harmonious cooperation between the various processes).
每当流程达到动态进度的权限取决于状态变量的当前值的阶段时,它就会遵循以下模式:
Whenever a process reaches a stage where the permission for dynamic progress depends on current values of state variables, it follows the pattern:
P(互斥体);
P(mutex);
“状态变量的检查和修改
“inspection and modification of state variables
包括条件V(私有信号量)”;
including a conditional V(private semaphore)”;
V(互斥体);
V(mutex);
P(私有信号量)。
P(private semaphore).
如果检查发现有问题的进程应该继续,它会执行操作“ V(私有信号量)”——信号量值然后从 0 变为 1——否则,该V操作将被跳过,将义务留给其他进程在适当的时刻执行此V操作。该义务的存在或缺失反映在离开临界区时状态变量的最终值中。
If the inspection learns that the process in question should continue, it performs the operation “V(private semaphore)”—the semaphore value then changes from 0 to 1—otherwise, this V-operation is skipped, leaving to the other processes the obligation to perform this V-operation at a suitable moment. The absence or presence of this obligation is reflected in the final values of the state variables upon leaving the critical section.
每当一个进程达到一个阶段,由于其进展,可能一个(或多个)被阻止的进程现在应该获得继续的许可,它遵循以下模式:
Whenever a process reaches a stage where as a result of its progress possibly one (or more) blocked processes should now get permission to continue, it follows the pattern:
P(互斥体);
P(mutex);
“状态变量的修改和检查,包括对其他进程的私有信号量进行零个或多个V操作”;
“modification and inspection of state variables including zero or more V-operations on private semaphores of other processes”;
V(互斥体)。
V(mutex).
通过引入合适的状态变量和对关键部分进行适当的编程,可以实现分配外设、缓冲区等的任何策略。
By the introduction of suitable state variables and appropriate programming of the critical sections any strategy assigning peripherals, buffer areas, etc. can be implemented.
通过观察,在上面概述的两个互补的关键部分中,可以通过引入“不稳定情况”的概念来执行相同的检查,例如免费阅读器和需要的过程,从而大大减少编码和推理的数量一位读者。每当出现不稳定情况时,它就会在创建它的关键部分中被删除(包括对私有信号量的一个或多个V操作)。
The amount of coding and reasoning can be greatly reduced by the observation that in the two complementary critical sections sketched above the same inspection can be performed by the introduction of the notion of “an unstable situation,” such as a free reader and a process needing a reader. Whenever an unstable situation emerges it is removed (including one or more V-operations on private semaphores) in the very same critical section in which it has been created.
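The two complementary patterns and the notion of an unstable situation can be sketched with the free-reader example mentioned above. This is an illustration under assumptions, not the system's code: the pool of two readers, the first-come-first-served choice in `settle`, and every name below are hypothetical, and `threading.Semaphore` stands in for the original primitives.

```python
import threading

mutex = threading.Semaphore(1)
free_readers = 2       # state variable (hypothetical pool of two readers)
wanting = []           # state variable: processes that need a reader
private = {}           # one private semaphore per process, initially 0

def settle():
    """Remove every unstable situation: a free reader plus a process wanting one.

    Must only be called inside a critical section.
    """
    global free_readers
    while free_readers > 0 and wanting:
        free_readers -= 1
        private[wanting.pop(0)].release()   # V(private semaphore)

def claim_reader(me):
    mutex.acquire()            # P(mutex)
    wanting.append(me)
    settle()                   # may perform the V on our own private semaphore
    mutex.release()            # V(mutex)
    private[me].acquire()      # P(private semaphore): wait until a reader is ours

def return_reader():
    global free_readers
    mutex.acquire()            # P(mutex)
    free_readers += 1
    settle()                   # may perform a V for some other, blocked process
    mutex.release()            # V(mutex)

in_use = 0
peak = 0
stats = threading.Lock()       # protects the demonstration counters only

def job(me):
    global in_use, peak
    claim_reader(me)
    with stats:
        in_use += 1
        peak = max(peak, in_use)
    with stats:
        in_use -= 1
    return_reader()

names = [f"p{i}" for i in range(5)]
for n in names:
    private[n] = threading.Semaphore(0)
threads = [threading.Thread(target=job, args=(n,)) for n in names]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak readers in use:", peak, "free readers at end:", free_readers)
```

The allocation strategy lives entirely in `settle`; replacing `wanting.pop(0)` with another choice changes the policy without touching the synchronization pattern.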
当循环进程离开其起始位置时,“它接受任务”;当任务完成且不更早时,过程返回到其起始位置。每个循环进程都有特定的任务处理能力(例如,执行用户程序或取消缓冲打印机输出的一部分等)。
When a cyclic process leaves its homing position “it accepts a task”; when the task has been performed and not earlier, the process returns to its homing position. Each cyclic process has a specific task processing power (e.g. the execution of a user program or unbuffering a portion of printer output, etc.).
和谐合作主要体现在大致三个阶段。
The harmonious cooperation is mainly proved in roughly three stages.
1. 事实证明,虽然执行任务的进程可以为其他进程生成有限数量的任务,但单个初始任务不能产生无限数量的任务生成。证明很简单,因为进程只能为层次结构较低级别的进程生成任务,因此排除了循环性。(如果需要来自鼓的段的进程已为段控制器生成了任务,则已采取特殊的预防措施来确保所请求的段至少保留在核心中,直到请求进程有效地访问了相关段。如果没有此预防措施有限的任务可能被迫为段控制器生成无限数量的任务,并且系统可能会陷入无效的页面抖动。)
1. It is proved that although a process performing a task may in so doing generate a finite number of tasks for other processes, a single initial task cannot give rise to an infinite number of task generations. The proof is simple as processes can only generate tasks for processes at lower levels of the hierarchy so that circularity is excluded. (If a process needing a segment from the drum has generated a task for the segment controller, special precautions have been taken to ensure that the segment asked for remains in core at least until the requesting process has effectively accessed the segment concerned. Without this precaution finite tasks could be forced to generate an infinite number of tasks for the segment controller, and the system could get stuck in an unproductive page flutter.)
2. 事实证明,不可能所有进程都已返回其归位位置,而系统中的某个位置仍然有待处理的已生成但未接受的任务。(这通过刚才描述的情况的不稳定性得到了证明。)
2. It is proved that it is impossible that all processes have returned to their homing position while somewhere in the system there is still pending a generated but unaccepted task. (This is proved via instability of the situation just described.)
3. 事实证明,在接受初始任务后,所有进程最终将(再次)处于其归位位置。在任务执行过程中被阻塞的每个进程都依赖于其他进程来消除障碍。本质上,所讨论的证明是不存在“循环等待”的证明:进程P等待进程Q等待进程R等待进程P。(我们通常对循环等待的术语是“致命的拥抱”。)……
3. It is proved that after the acceptance of an initial task all processes eventually will be (again) in their homing position. Each process blocked in the course of task execution relies on the other processes for removal of the barrier. Essentially, the proof in question is a demonstration of the absence of “circular waits”: process P waiting for process Q waiting for process R waiting for process P. (Our usual term for the circular wait is “the Deadly Embrace.”) …
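The notion of a circular wait in stage 3 can be made concrete with a wait-for graph. The paper's proof is a static argument about the design; the runtime check below is a different technique, shown only to illustrate what is being excluded. The function name and the two example graphs are hypothetical, and the sketch assumes each blocked process waits on exactly one other process.

```python
def find_circular_wait(waits_for):
    """Return a cycle in a wait-for graph as a list, or None if there is none.

    `waits_for` maps each blocked process to the one process it waits on.
    """
    for start in waits_for:
        seen = []
        node = start
        # Follow the chain of waits until it ends or revisits a process.
        while node in waits_for and node not in seen:
            seen.append(node)
            node = waits_for[node]
        if node in seen:
            return seen[seen.index(node):]
    return None

# The Deadly Embrace from the text: P waits for Q, Q for R, R for P.
deadly = {"P": "Q", "Q": "R", "R": "P"}

# An illustrative chain in which each process waits only on a lower level.
layered = {"user": "buffering", "buffering": "segment controller"}

print(find_circular_wait(deadly))
print(find_circular_wait(layered))
```

In the second graph every wait points strictly downward, so no chain can return to its starting point; this mirrors the argument of stage 1, where tasks are generated only for lower levels.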
经计算机协会许可,转载自 Dijkstra (1968b)。
Reprinted from Dijkstra (1968b), with permission from the Association for Computing Machinery.
“结构化编程”的发展并不是一下子发生的。它随着软件行业的成熟而发展:编译器变得更好,因此程序员不太愿意为了节省一些计算步骤或几个字节的程序内存而编写棘手的代码;系统变得越来越大,因此团队更有动力编写具有易于解释的模块化结构的可理解代码。早期的处理器缺乏堆栈操作。到 1968 年,递归不再是外来事物,并且在体系结构上得到了支持,像 while 这样的循环结构在编程语言中变得很常见(这篇论文发表于 Wirth 设计 PASCAL 语言时)。
The movement toward “structured programming” did not happen all at once. It evolved as the software industry matured: compilers got better, so programmers were less motivated to write tricky code in order to save a few compute steps or a few bytes of program memory; and systems got bigger, so teams were more motivated to write comprehensible code with easily explained modular structure. Early processors lacked stack operations. By 1968 recursion was no longer exotic and was architecturally supported, and looping structures like while were becoming common in programming languages (this paper was published about the time Wirth was designing the PASCAL language).
因此,正如 Edsger Dijkstra(见第 26 章)所承认的那样,这篇文章实际上只是一封“致编辑的信”,与其说是科学贡献,不如说是对正在进行的运动的总结陈述,并呼吁采取行动。尽管如此,这篇文章对编程实践产生了深远的影响——“结构化编程”一词在接下来的十年中出现在数百篇文章和书籍中。与其他对正统观念的呼吁一样,迪杰斯特拉的信也引起了一些反对,或者至少呼吁保持灵活性。Donald Knuth (1974a) 记录了有些事情确实更容易用go to 来表达,并建议结构化编程可以与go to语句共存。但总的来说,Dijkstra 的论点赢得了胜利,因为它改变了编程教学方式,并为编程语言的语言结构设定了最低条件。
So as Edsger Dijkstra (see chapter 26) acknowledges, this article—actually a mere “Letter to the Editor”—is less a scientific contribution than a summary statement of an ongoing movement, accompanied by a call to action. Be that as it may, the piece had a profound effect on programming practice—the words “structured programming” appeared in hundreds of article and book titles over the next decade. Like other calls to orthodoxy, Dijkstra’s letter drew some opposition, or at least pleas for flexibility. Donald Knuth (1974a) documented that some things really were easier to say with go tos and suggested that structured programming could coexist with go to statements. But by and large, Dijkstra’s argument won the day, by shifting the way programming was taught and by setting minimum conditions on the linguistic structures in programming languages.
一条文字注释:现在被计算机科学家广泛模仿的“被认为有害”一词并非 Dijkstra 的。编辑 Niklaus Wirth 在决定将这篇文章(最初的标题是“反对 GOTO 语句的案例”)作为一封信而不是作为研究论文发表时提供了标题。这个公式也不是沃斯原创的。当这篇论文发表时,这已经是一个新闻界的会议了(Laplante,1996,p.420)。
One textual note: The phrase “considered harmful,” now widely imitated by computer scientists, was not Dijkstra’s. The editor, Niklaus Wirth, supplied the title when he decided to publish this note (originally entitled “A Case Against the GOTO Statement”) as a letter rather than as a research paper. And the formula was not original with Wirth either; it was already a journalistic convention when this paper was published (Laplante, 1996, p. 420).
编辑:多年来,我一直熟悉这样的观察:程序员的质量是他们编写的程序中go to语句密度的递减函数。最近,我发现了为什么go to语句的使用会产生如此灾难性的影响,并且我开始相信go to语句应该从所有“高级”编程语言(即,除了纯机器代码之外的所有语言)中废除。当时我并没有太重视这个发现;我现在提交我的出版考虑,因为在最近出现这个主题的讨论中,我被敦促这样做。
EDITOR: For a number of years I have been familiar with the observation that the quality of programmers is a decreasing function of the density of go to statements in the programs they produce. More recently I discovered why the use of the go to statement has such disastrous effects, and I became convinced that the go to statement should be abolished from all “higher level” programming languages (i.e., everything except, perhaps, plain machine code). At that time I did not attach too much importance to this discovery; I now submit my considerations for publication because in very recent discussions in which the subject turned up, I have been urged to do so.
我的第一句话是,尽管程序员的活动在他构建了正确的程序后就结束了,但在程序控制下发生的过程才是他活动的真正主题,因为正是这个过程必须达到预期的效果;正是这个过程的动态行为必须满足所需的规范。然而,一旦程序制定完成,相应过程的“制作”就委托给机器了。
My first remark is that, although the programmer’s activity ends when he has constructed a correct program, the process taking place under control of his program is the true subject matter of his activity, for it is this process that has to accomplish the desired effect; it is this process that in its dynamic behavior has to satisfy the desired specifications. Yet, once the program has been made, the “making” of the corresponding process is delegated to the machine.
我的第二句话是,我们的智力更适合掌握静态关系,而我们将随时间演变的过程可视化的能力发展得相对较差。因此,我们应该(作为意识到自身局限的明智程序员)尽最大努力缩短静态程序和动态过程之间的概念差距,使程序(在文本空间中展开)和过程(在时间中展开)之间的对应关系尽可能简单。
My second remark is that our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.
现在让我们考虑如何描述流程的进展。(您可能会以非常具体的方式思考这个问题:假设一个过程被视为一系列动作的时间序列,在任意动作之后停止,我们必须修复哪些数据才能重做该过程,直到完全相同的点?)如果程序文本是赋值语句的纯粹串联(出于本次讨论的目的,被视为单个操作的描述),那么在程序文本中指向两个连续操作之间的点就足够了描述。(在没有go to语句的情况下,我可以允许自己在前一句的最后三个单词中存在语法歧义:如果我们将它们解析为“连续(动作描述)”,我们意味着文本空间中的连续;如果我们解析为“(连续的动作)描述”,我们的意思是时间上的连续。)让我们将这样一个指向文本中合适位置的指针称为“文本索引”。
Let us now consider how we can characterize the progress of a process. (You may think about this question in a very concrete manner: suppose that a process, considered as a time succession of actions, is stopped after an arbitrary action, what data do we have to fix in order that we can redo the process until the very same point?) If the program text is a pure concatenation of, say, assignment statements (for the purpose of this discussion regarded as the descriptions of single actions) it is sufficient to point in the program text to a point between two successive action descriptions. (In the absence of go to statements I can permit myself the syntactic ambiguity in the last three words of the previous sentence: if we parse them as “successive (action descriptions)” we mean successive in text space; if we parse as “(successive action) descriptions” we mean successive in time.) Let us call such a pointer to a suitable place in the text a “textual index.”
当我们包含条件子句(if $B$ then $A$)、替代子句(if $B$ then $A_1$ else $A_2$)、C. A. R. Hoare 引入的选择子句(case[$i$] of ($A_1, A_2, \ldots, A_n$)),或 J. McCarthy 引入的条件表达式($B_1 \to E_1, B_2 \to E_2, \ldots, B_n \to E_n$)时,事实仍然是:该过程的进展仍然由单个文本索引来表征。
When we include conditional clauses (if $B$ then $A$), alternative clauses (if $B$ then $A_1$ else $A_2$), choice clauses as introduced by C. A. R. Hoare (case[$i$] of ($A_1, A_2, \ldots, A_n$)), or conditional expressions as introduced by J. McCarthy ($B_1 \to E_1, B_2 \to E_2, \ldots, B_n \to E_n$), the fact remains that the progress of the process remains characterized by a single textual index.
一旦我们将其纳入我们的语言程序中,我们就必须承认单个文本索引不再足够。在文本索引指向过程体内部的情况下,只有当我们还给出了我们所引用的过程的调用时,动态进度才会被表征。通过包含过程,我们可以通过一系列文本索引来表征过程的进度,该序列的长度等于过程调用的动态深度。
As soon as we include in our language procedures we must admit that a single textual index is no longer sufficient. In the case that a textual index points to the interior of a procedure body the dynamic progress is only characterized when we also give to which call of the procedure we refer. With the inclusion of procedures we can characterize the progress of the process via a sequence of textual indices, the length of this sequence being equal to the dynamic depth of procedure calling.
现在让我们考虑重复子句(例如,while B 重复 A或重复 A 直到 B)。从逻辑上讲,这样的子句现在是多余的,因为我们可以借助递归过程来表达重复。出于现实的原因,我不想排除它们:一方面,重复条款可以用当今有限的设备相当轻松地实现;另一方面,被称为“归纳”的推理模式使我们能够很好地保持对重复子句生成过程的智力掌握。随着重复子句的加入,文本索引不再足以描述过程的动态进展。然而,对于重复子句的每个条目,我们可以关联一个所谓的“动态索引”,无情地计算相应当前重复的序数。由于重复子句(就像过程调用一样)可以嵌套应用,我们发现现在过程的进度始终可以通过文本和/或动态索引的(混合)序列来唯一地表征。
Let us now consider repetition clauses (like, while B repeat A or repeat A until B). Logically speaking, such clauses are now superfluous, because we can express repetition with the aid of recursive procedures. For reasons of realism I don’t wish to exclude them: on the one hand, repetition clauses can be implemented quite comfortably with present day finite equipment; on the other hand, the reasoning pattern known as “induction” makes us well equipped to retain our intellectual grasp on the processes generated by repetition clauses. With the inclusion of the repetition clauses textual indices are no longer sufficient to describe the dynamic progress of the process. With each entry into a repetition clause, however, we can associate a so-called “dynamic index,” inexorably counting the ordinal number of the corresponding current repetition. As repetition clauses (just as procedure calls) may be applied nestedly, we find that now the progress of the process can always be uniquely characterized by a (mixed) sequence of textual and/or dynamic indices.
要点是这些索引的值不在程序员的控制范围内;无论他是否愿意,它们都会生成(通过编写他的程序或通过过程的动态演变)。它们提供独立的坐标来描述过程的进度。为什么我们需要这样的独立坐标?原因是——这似乎是顺序过程所固有的——我们只能根据过程的进展来解释变量的值。如果我们想要计算最初空房间中的人数,例如n ,我们可以通过每当看到有人进入房间时将n加一来实现。在我们观察到有人进入房间但尚未执行后续n增加的中间时刻,其值等于房间中的人数减一!
The main point is that the values of these indices are outside programmer’s control; they are generated (either by the write-up of his program or by the dynamic evolution of the process) whether he wishes or not. They provide independent coordinates in which to describe the progress of the process. Why do we need such independent coordinates? The reason is—and this seems to be inherent to sequential processes—that we can interpret the value of a variable only with respect to the progress of the process. If we wish to count the number, n say, of people in an initially empty room, we can achieve this by increasing n by one whenever we see someone entering the room. In the in-between moment that we have observed someone entering the room but have not yet performed the subsequent increase of n, its value equals the number of people in the room minus one!
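The room-counting example can be traced in a few lines. This is only a sketch: the function name `count_entries` and the `"before"`/`"after"` tags are hypothetical labels for the two kinds of progress point the text distinguishes.

```python
def count_entries(arrivals):
    """Count people entering an initially empty room.

    Returns a trace of (tag, people_in_room, n) triples, recorded at the two
    kinds of progress point: "before" (someone has entered, n not yet
    increased) and "after" (n has been increased).
    """
    people_in_room = 0  # the actual state of the room
    n = 0               # the program's running count
    trace = []
    for _ in range(arrivals):
        people_in_room += 1  # someone is seen entering the room
        trace.append(("before", people_in_room, n))
        n += 1               # the subsequent increase of n
        trace.append(("after", people_in_room, n))
    return trace
```

At every "after" point `n` equals the number of people in the room; at every "before" point it is one less. The same value of `n` therefore means different things depending on where in the program text the process currently stands.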
肆无忌惮地使用go to语句会产生一个直接后果,即很难找到一组有意义的坐标来描述流程进度。通常,人们也会考虑一些精心选择的变量的值,但这是不可能的,因为它与理解这些值的含义的进度有关!当然,使用go to语句,我们仍然可以通过计数器计算自程序启动以来执行的操作数量(即一种标准化时钟)来唯一地描述进度。困难在于,这样的坐标虽然是唯一的,但却完全没有帮助。在这样的坐标系中,定义所有这些进展点变得极其复杂,例如,n等于房间中的人数减一!
The unbridled use of the go to statement has an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress. Usually, people take into account as well the values of some well chosen variables, but this is out of the question because it is relative to the progress that the meaning of these values is to be understood! With the go to statement one can, of course, still describe the progress uniquely by a counter counting the number of actions performed since program start (viz. a kind of normalized clock). The difficulty is that such a coordinate, although unique, is utterly unhelpful. In such a coordinate system it becomes an extremely complicated affair to define all those points of progress where, say, n equals the number of persons in the room minus one!
现在的 go to 语句太原始了;邀请人把自己的计划搞得一团糟实在是太过分了。人们可以重视并欣赏那些被视为限制其使用的条款。我并不认为所提到的条款是详尽无遗的,因为它们将满足所有需求,但无论建议什么条款(例如堕胎条款),它们都应该满足这样的要求:可以维护一个独立于程序员的坐标系来描述过程有用且易于管理的方式。
The go to statement as it stands is just too primitive; it is too much an invitation to make a mess of one’s program. One can regard and appreciate the clauses considered as bridling its use. I do not claim that the clauses mentioned are exhaustive in the sense that they will satisfy all needs, but whatever clauses are suggested (e.g. abortion clauses) they should satisfy the requirement that a programmer independent coordinate system can be maintained to describe the process in a helpful and manageable way.
很难以一份公正的致谢来结束本文。我该如何判断自己的思想受到了谁的影响?很明显,我并非没有受到 Peter Landin 和 Christopher Strachey 的影响。最后,我想记录一下(因为我记得很清楚),Heinz Zemanek 在 1959 年初于哥本哈根举行的 pre-ALGOL 会议上,如何相当明确地表达了他的疑虑:go to 语句是否应该与赋值语句处于同等的语法地位。在一定程度上,我责怪自己当时没有从他的话中得出相应的结论。
It is hard to end this with a fair acknowledgment. Am I to judge by whom my thinking has been influenced? It is fairly obvious that I am not uninfluenced by Peter Landin and Christopher Strachey. Finally I should like to record (as I remember it quite distinctly) how Heinz Zemanek at the pre-ALGOL meeting in early 1959 in Copenhagen quite explicitly expressed his doubts whether the go to statement should be treated on equal syntactic footing with the assignment statement. To a modest extent I blame myself for not having then drawn the consequences of his remark.
关于go to语句不可取的言论早已不是什么新鲜事了。我记得读过明确的建议,限制使用go to语句来报警退出,但我无法追踪它;据推测,它是由 CAR Hoare 制造的。Wirth 和 Hoare (1966, §3.21) 在激发案例构造时做了相同的评论:“像条件一样,它比 go to 语句和开关更清楚地反映了程序的动态结构,并且它消除了在节目中引入大量标签。”
The remark about the undesirability of the go to statement is far from new. I remember having read the explicit recommendation to restrict the use of the go to statement to alarm exits, but I have not been able to trace it; presumably, it has been made by C. A. R. Hoare. Wirth and Hoare (1966, §3.21) make a remark in the same direction in motivating the case construction: “Like the conditional, it mirrors the dynamic structure of a program more clearly than go to statements and switches, and it eliminates the need for introducing a large number of labels in the program.”
Böhm 和 Jacopini (1966) 似乎已经证明了go to语句的(逻辑)多余性。然而,不建议将任意流程图或多或少机械地转换为无跳转流程图。那么所得到的流程图就不能指望比原始流程图更透明。
Böhm and Jacopini (1966) seem to have proved the (logical) superfluousness of the go to statement. The exercise to translate an arbitrary flow diagram more or less mechanically into a jumpless one, however, is not to be recommended. Then the resulting flow diagram cannot be expected to be more transparent than the original one.
经计算机协会许可,转载自 Dijkstra (1968a)。
Reprinted from Dijkstra (1968a), with permission from the Association for Computing Machinery.
矩阵乘法是一个如此简单的运算,很难想象还有什么需要学习的。要将两个 $n \times n$ 矩阵 $A$ 和 $B$ 相乘并得到 $n \times n$ 乘积矩阵 $C$,请计算 $A$ 的行与 $B$ 的列的 $n^2$ 个点积。每个点积都涉及 $n$ 次数字乘法和 $n - 1$ 次加法,总共 $n^3$ 次数字乘法和 $n^2(n - 1)$ 次加法。还能说什么呢?
Matrix multiplication is such a simple operation that it is hard to imagine there is anything left to learn about it. To multiply two $n \times n$ matrices $A$ and $B$ and get an $n \times n$ product matrix $C$, compute the $n^2$ dot products of rows of $A$ with columns of $B$. Each of those dot products involves $n$ multiplications of numbers and $n - 1$ additions, for a total of $n^3$ number multiplications and $n^2(n - 1)$ additions. What else could there be to say?
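The operation counts for the conventional method can be checked directly. The following is a minimal sketch; `naive_matmul` is a hypothetical name, and matrices are assumed to be square lists of lists.

```python
def naive_matmul(a, b):
    """Multiply two n x n matrices row by column, counting scalar operations.

    Returns (product, multiplications, additions).
    """
    n = len(a)
    mults = adds = 0
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # First term of the dot product: one multiplication, no addition.
            total = a[i][0] * b[0][j]
            mults += 1
            # Remaining n - 1 terms: one multiplication and one addition each.
            for k in range(1, n):
                total += a[i][k] * b[k][j]
                mults += 1
                adds += 1
            c[i][j] = total
    return c, mults, adds
```

For $n = 3$ this performs 27 multiplications and 18 additions, matching $n^3$ and $n^2(n-1)$.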
事实证明,还有很多。德国数学家 Volker Strassen(生于 1936 年)在发现该算法时,可能一直在试图证明一个下界,即 $n^3$ 次乘法既是充分的也是必要的。这篇论文包含两个值得注意的想法。首先,如果有一种方法可以用少于 8 次乘法来计算 $2 \times 2$ 矩阵的乘积,则分而治之的递归算法可能会击败传统算法。即使在看到这一点被证明之后,实现递归的开销能够渐近地得到偿还,似乎仍然令人惊讶。另一个惊人的发现是,两个 $2 \times 2$ 矩阵只需 7 次乘法即可相乘。任何一个高中生都可能在课间于纸上涂写时想到这一点;几个世纪以来,人们一直在进行矩阵乘法,但没有人注意到,因为没有人有理由去尝试。(Karatsuba 和 Ofman (1962) 提出的一种类似的整数乘法算法当时已经为人所知。它通过三次 $n$ 位数的乘法递归地计算两个 $2n$ 位数的乘积,从而得到时间为 $O(n^{\log_2 3}) \approx n^{1.58}$ 的 $n$ 位乘法算法,优于传统的 $\Theta(n^2)$ 算法。)
A great deal, it turns out. The German mathematician Volker Strassen (b. 1936) may have been trying to prove a lower bound, that $n^3$ multiplications are necessary as well as sufficient, when he discovered this algorithm. The paper entails two remarkable ideas. The first is that a divide-and-conquer, recursive algorithm might beat the conventional algorithm, if there is a way to compute the product of $2 \times 2$ matrices with fewer than 8 multiplications. Even after seeing this proved, it still seems surprising that the overhead of implementing the recursion is asymptotically repaid. The other amazing discovery is that two $2 \times 2$ matrices can be multiplied with only 7 multiplications. Any high school student might have figured that out scribbling on a pad of paper between classes; in the centuries that people have been multiplying matrices, nobody noticed because nobody had a reason to try. (An analogous algorithm for integer multiplication, due to Karatsuba and Ofman (1962), was already known. It recursively computes the product of two $2n$-bit numbers by three multiplications of $n$-bit numbers, thus yielding an $O(n^{\log_2 3}) \approx n^{1.58}$ time algorithm for $n$-bit multiplications, better than the conventional $\Theta(n^2)$ algorithm.)
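The three-multiplication idea for integers can be sketched as follows. This is an illustration under assumptions rather than the formulation of Karatsuba and Ofman: the function name, the cutoff of 16 for the base case, and the use of Python's built-in integers are all choices made here.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers with three recursive half-size products."""
    if x < 16 or y < 16:
        return x * y  # small operands: fall back to direct multiplication

    # Split each operand into a high and a low part at `half` bits.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    lo = karatsuba(x_lo, y_lo)
    hi = karatsuba(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) - lo - hi equals x_lo*y_hi + x_hi*y_lo,
    # so the cross terms cost one multiplication instead of two.
    mid = karatsuba(x_lo + x_hi, y_lo + y_hi) - lo - hi

    return (hi << (2 * half)) + (mid << half) + lo
```

Each call makes three recursive calls on operands of roughly half the length, which is where the exponent $\log_2 3$ comes from.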
施特拉森的算法很难正确且有效地实现,但它在良好实现下的实用性不仅仅是理论上的。$n \times n$ 矩阵可以使用 $n^{\log_2 7} \approx n^{2.8}$ 次乘法进行相乘,这一发现引出了指数还可以小多少这一仍未解决的问题。截至撰写本文时,答案不超过 2.373,但不知道大于 2 的下界;然而,这些更奇特的算法实际上并没有什么用处。
Strassen’s algorithm is tricky to implement both correctly and efficiently, but its utility under a good implementation is not merely theoretical. The discovery that $n \times n$ matrices can be multiplied using $n^{\log_2 7} \approx n^{2.8}$ multiplications led to the still unsolved problem of how much smaller the exponent can be. As of this writing, the answer is no more than 2.373, but no lower bound greater than 2 is known; these more exotic algorithms are not practically useful, however.
这篇论文与 Karatsuba 和 Ofman (1962) 一起,确立了分而治之技术作为解决各种算法问题的工具。有效求解线性方程组的含义(该论文的标题正是由此而来)本身就非常引人注目。
This paper, alongside Karatsuba and Ofman (1962), established the divide-and-conquer technique as a tool for a variety of algorithmic problems. The implications for efficiently solving systems of linear equations—which give the paper its title—are remarkable in their own right.
下面我们将给出一种算法,它根据两个 $n$ 阶方阵 $A$ 和 $B$ 的系数,用少于 $4.7 \cdot n^{\log 7}$ 次算术运算来计算它们乘积的系数(本文中的所有对数都以 2 为底,因此 $\log 7 \approx 2.8$;通常的方法需要大约 $2n^3$ 次算术运算)。该算法导出了用于求 $n$ 阶矩阵的逆、求解含 $n$ 个未知数的 $n$ 个线性方程组、计算 $n$ 阶行列式等的算法,所有这些都需要少于 $\mathrm{const} \cdot n^{\log 7}$ 次算术运算。
Below we will give an algorithm which computes the coefficients of the product of two square matrices $A$ and $B$ of order $n$ from the coefficients of $A$ and $B$ with less than $4.7 \cdot n^{\log 7}$ arithmetical operations (all logarithms in this paper are for base 2, thus $\log 7 \approx 2.8$; the usual method requires approximately $2n^3$ arithmetical operations). The algorithm induces algorithms for inverting a matrix of order $n$, solving a system of $n$ linear equations in $n$ unknowns, computing a determinant of order $n$ etc. all requiring less than $\mathrm{const} \cdot n^{\log 7}$ arithmetical operations.
这一事实应该与 Klyuev 和 Kokovkin-Shcherbak (1965) 的结果进行比较,即如果将自己限制为对行和列作为一个整体进行运算,那么求解线性方程组的高斯消元法是最佳的。我们还注意到,Winograd(1968)修改了矩阵乘法和求逆以及求解线性方程组的常用算法,将大约一半的乘法换成了加法和减法。我很高兴感谢 D. Brillinger 对当前主题的启发性讨论,感谢 S. Cook 和 B. Parlett 鼓励我写这篇论文。
This fact should be compared with the result of Klyuev and Kokovkin-Shcherbak (1965) that Gaussian elimination for solving a system of linear equations is optimal if one restricts oneself to operations upon rows and columns as a whole. We also note that Winograd (1968) modifies the usual algorithms for matrix multiplication and inversion and for solving systems of linear equations, trading roughly half of the multiplications for additions and subtractions. It is a pleasure to thank D. Brillinger for inspiring discussions about the present subject and S. Cook and B. Parlett for encouraging me to write this paper.
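我们通过对 $k$ 的归纳来定义用于乘 $m2^k$ 阶矩阵的算法 $\alpha_{m,k}$:$\alpha_{m,0}$ 是矩阵乘法的常用算法(需要 $m^3$ 次乘法和 $m^2(m-1)$ 次加法)。设 $\alpha_{m,k}$ 已知,定义 $\alpha_{m,k+1}$ 如下:
We define algorithms $\alpha_{m,k}$ which multiply matrices of order $m2^k$, by induction on $k$: $\alpha_{m,0}$ is the usual algorithm for matrix multiplication (requiring $m^3$ multiplications and $m^2(m-1)$ additions). $\alpha_{m,k}$ already being known, define $\alpha_{m,k+1}$ as follows:
如果 $A$、$B$ 是要相乘的 $m2^{k+1}$ 阶矩阵,则写
If $A$, $B$ are matrices of order $m2^{k+1}$ to be multiplied, write
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \quad B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}, \quad AB = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix},$$
其中 $A_{ik}$、$B_{ik}$、$C_{ik}$ 是 $m2^k$ 阶矩阵。然后计算
where the $A_{ik}$, $B_{ik}$, $C_{ik}$ are matrices of order $m2^k$. Then compute
使用 $\alpha_{m,k}$ 进行乘法,并使用通常的算法进行 $m2^k$ 阶矩阵的加法和减法。
using $\alpha_{m,k}$ for multiplication and the usual algorithm for addition and subtraction of matrices of order $m2^k$.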
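One recursive step can be sketched in code. The seven block products below (`p1` to `p7`) follow the form in which Strassen's scheme is commonly stated; they are supplied here as an assumption, as are all function names. The base case uses the usual algorithm once the order drops to `m` or becomes odd.

```python
def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]


def naive(a, b):
    """The usual row-by-column product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split(a):
    """Return the four half-order blocks a11, a12, a21, a22."""
    h = len(a) // 2
    return ([row[:h] for row in a[:h]], [row[h:] for row in a[:h]],
            [row[:h] for row in a[h:]], [row[h:] for row in a[h:]])


def strassen(a, b, m=1):
    """Multiply square matrices of order m * 2**k with 7 block products per level."""
    n = len(a)
    if n <= m or n % 2:
        return naive(a, b)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    # Seven recursive multiplications (left factors always come from a).
    p1 = strassen(add(a11, a22), add(b11, b22), m)
    p2 = strassen(add(a21, a22), b11, m)
    p3 = strassen(a11, sub(b12, b22), m)
    p4 = strassen(a22, sub(b21, b11), m)
    p5 = strassen(add(a11, a12), b22, m)
    p6 = strassen(sub(a21, a11), add(b11, b12), m)
    p7 = strassen(sub(a12, a22), add(b21, b22), m)

    # Eight more block additions/subtractions assemble the result (18 in total).
    c11 = add(sub(add(p1, p4), p5), p7)
    c12 = add(p3, p5)
    c21 = add(p2, p4)
    c22 = add(sub(add(p1, p3), p2), p6)
    return [r + s for r, s in zip(c11, c12)] + [r + s for r, s in zip(c21, c22)]
```

The products never rely on commutativity of the entries, which is what allows the same identities to be applied to blocks rather than numbers.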
通过对k 的归纳,我们很容易看出
By induction on k one easily sees
事实 1. $\alpha_{m,k}$ 用 $m^3 7^k$ 次乘法以及 $(5+m)m^2 7^k - 6(m2^k)^2$ 次数的加法和减法来计算两个 $m2^k$ 阶矩阵的乘积。
Fact 1. $\alpha_{m,k}$ computes the product of two matrices of order $m2^k$ with $m^3 7^k$ multiplications and $(5+m)m^2 7^k - 6(m2^k)^2$ additions and subtractions of numbers.
因此,可以用 $7^k$ 次数的乘法以及少于 $6 \cdot 7^k$ 次加法和减法将两个 $2^k$ 阶矩阵相乘。
Thus one may multiply two matrices of order $2^k$ with $7^k$ number multiplications and less than $6 \cdot 7^k$ additions and subtractions.
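The induction behind Fact 1 can be written out as a pair of recurrences. The sketch assumes that one step of the recursion uses 7 block multiplications and 18 block additions or subtractions of matrices of order $m2^k$; the symbols $M_k$ and $S_k$ are introduced here for the two counts.

```latex
% M_k: number multiplications used by alpha_{m,k}
% S_k: number additions and subtractions used by alpha_{m,k}
\begin{align*}
M_0 &= m^3, & M_{k+1} &= 7M_k
  &&\Longrightarrow\; M_k = m^3 7^k,\\
S_0 &= m^2(m-1), & S_{k+1} &= 7S_k + 18\,(m2^k)^2
  &&\Longrightarrow\; S_k = (5+m)\,m^2 7^k - 6\,(m2^k)^2 .
\end{align*}
% Base case for S: (5+m)m^2 - 6m^2 = m^3 - m^2 = m^2(m-1).
% Inductive step for S:
\begin{align*}
7S_k + 18\,(m2^k)^2
 &= (5+m)\,m^2 7^{k+1} - 42\,(m2^k)^2 + 18\,(m2^k)^2 \\
 &= (5+m)\,m^2 7^{k+1} - 24\,(m2^k)^2
  = (5+m)\,m^2 7^{k+1} - 6\,(m2^{k+1})^2 .
\end{align*}
```

Setting $m = 1$ gives $7^k$ multiplications and $6 \cdot 7^k - 6 \cdot 4^k < 6 \cdot 7^k$ additions and subtractions for matrices of order $2^k$.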
事实 2. 两个 $n$ 阶矩阵的乘积可以用少于 $4.7\,n^{\log 7}$ 次算术运算来计算。
Fact 2. The product of two matrices of order $n$ may be computed with $< 4.7\,n^{\log 7}$ arithmetical operations.
证明。令 $k = \lfloor \log n - 4 \rfloor$,$m = \lfloor n2^{-k} \rfloor + 1$;则 $n \le m2^k$。将 $n$ 阶矩阵嵌入到 $m2^k$ 阶矩阵中,可以将我们的任务简化为估计 $\alpha_{m,k}$ 的运算次数。根据事实 1,这个数字是
Proof. Put $k = \lfloor \log n - 4 \rfloor$, $m = \lfloor n2^{-k} \rfloor + 1$; then $n \le m2^k$. Imbedding matrices of order $n$ into matrices of order $m2^k$ reduces our task to that of estimating the number of operations of $\alpha_{m,k}$. By Fact 1 this number is
通过凸性论证。
by a convexity argument.
我们现在转向矩阵求逆。要应用下面的算法,不仅需要假设矩阵是可逆的,而且所有发生的除法都有意义(类似的假设对于高斯消除当然是必要的)。
We now turn to matrix inversion. To apply the algorithms below it is necessary to assume not only that the matrix is invertible but that all occurring divisions make sense (a similar assumption is of course necessary for Gaussian elimination).
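我们通过对 $k$ 的归纳来定义用于求 $m2^k$ 阶矩阵的逆的算法 $\beta_{m,k}$:$\beta_{m,0}$ 是通常的高斯消元算法。设 $\beta_{m,k}$ 已知,定义 $\beta_{m,k+1}$ 如下:
We define algorithms $\beta_{m,k}$ which invert matrices of order $m2^k$, by induction on $k$: $\beta_{m,0}$ is the usual Gaussian elimination algorithm. $\beta_{m,k}$ already being known, define $\beta_{m,k+1}$ as follows:
如果 $A$ 是要求逆的 $m2^{k+1}$ 阶矩阵,则写
If $A$ is a matrix of order $m2^{k+1}$ to be inverted, write
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \quad A^{-1} = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix},$$
其中 $A_{ik}$、$C_{ik}$ 是 $m2^k$ 阶矩阵。然后计算
where the $A_{ik}$, $C_{ik}$ are matrices of order $m2^k$. Then compute
使用 $\alpha_{m,k}$ 进行乘法,使用 $\beta_{m,k}$ 进行求逆,并使用通常的算法进行两个 $m2^k$ 阶矩阵的加法或减法。
using $\alpha_{m,k}$ for multiplication, $\beta_{m,k}$ for inversion and the usual algorithm for addition or subtraction of two matrices of order $m2^k$.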
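A block inversion of this shape can be sketched with exact rational arithmetic. The sequence of intermediate results (`i` to `vii`) is the standard Schur-complement recipe and is supplied here as an assumption; all names are hypothetical. For brevity the sketch takes $m = 1$, requires the order to be a power of two, and uses the ordinary product where the text would use $\alpha_{m,k}$.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matsub(a, b):
    return [[x - y for x, y in zip(r, s)] for r, s in zip(a, b)]


def split(a):
    h = len(a) // 2
    return ([row[:h] for row in a[:h]], [row[h:] for row in a[:h]],
            [row[:h] for row in a[h:]], [row[h:] for row in a[h:]])


def block_inverse(a):
    """Invert a matrix whose order is a power of two.

    Raises ZeroDivisionError if an occurring division does not make sense,
    which can happen even when the matrix itself is invertible.
    """
    n = len(a)
    if n == 1:
        return [[1 / Fraction(a[0][0])]]
    if n % 2:
        raise ValueError("order must be a power of two in this sketch")
    a11, a12, a21, a22 = split(a)

    i = block_inverse(a11)        # A11^-1
    ii = matmul(a21, i)
    iii = matmul(i, a12)
    iv = matmul(a21, iii)
    v = matsub(iv, a22)           # minus the Schur complement of A11
    vi = block_inverse(v)
    c12 = matmul(iii, vi)
    c21 = matmul(vi, ii)
    vii = matmul(iii, c21)
    c11 = matsub(i, vii)
    c22 = [[-x for x in row] for row in vi]

    return [r + s for r, s in zip(c11, c12)] + [r + s for r, s in zip(c21, c22)]
```

Each level performs two recursive inversions and six block multiplications, so the cost is dominated by the multiplication algorithm plugged in.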
通过对k 的归纳,我们很容易看出
By induction on k one easily sees
事实 3. $\beta_{m,k}$ 用 $m2^k$ 次除法、乘法以及加法和减法来计算 $m2^k$ 阶矩阵的逆矩阵。下一个事实可用与事实 2 相同的方式得出。
Fact 3. $\beta_{m,k}$ computes the inverse of a matrix of order $m2^k$ with $m2^k$ divisions, multiplications and additions and subtractions of numbers. The next Fact follows in the same way as Fact 2.
事实 4. $n$ 阶矩阵的逆可以用少于 $5.64 \cdot n^{\log 7}$ 次算术运算来计算。
Fact 4. The inverse of a matrix of order $n$ may be computed with $< 5.64 \cdot n^{\log 7}$ arithmetical operations.
类似的结果适用于求解线性方程组或计算行列式(使用)。
Similar results hold for solving a system of linear equations or computing a determinant (use ).
经 Springer 许可,转载自 Strassen (1969)。
Reprinted from Strassen (1969), with permission from Springer.
CAR “Tony” Hoare(生于 1934 年)“因其对编程语言的定义和设计的基本贡献”而于 1980 年荣获图灵奖(Hoare,1981)。这一选择展示了他最具影响力的贡献:将 Dijkstra 坚持的计算机编程是数学推理的一个分支的观点正式化的雄心,并实现 John McCarthy (1963) 的议程:“人们应该证明它满足其要求,而不是调试程序。规范,并且该证明应该由计算机程序检查。”
C. A. R. “Tony” Hoare (b. 1934) was recognized with the Turing Award in 1980 “for his fundamental contributions to the definition and design of programming languages” (Hoare, 1981). This selection presents his most influential contribution: the ambition to formalize Dijkstra’s insistence that computer programming was a branch of mathematical reasoning, and to fulfill the agenda of John McCarthy (1963): “Instead of debugging a program, one should prove that it meets its specifications, and this proof should be checked by a computer program.”
霍尔曾在牛津大学接受教育,在那里他学习了现代分析哲学,从而熟悉了数理逻辑。他加入了英国计算机公司 Elliott Brothers,领导团队负责构建 A LGOL 60 编程语言的编译器。那次经历让他接触到了 Edsger Dijkstra,并认识到可以优雅、简洁地表达编程概念的语言的重要性。1968年,他出任贝尔法斯特女王大学教授。在那里,他阅读了罗伯特·弗洛伊德 (Robert Floyd) 的论文“为程序分配含义”(Floyd,1967),该论文描述了一种将不变谓词附加到流程图边缘的方法,从而可以对程序的行为进行严格的分析。霍尔决定摆脱流程图符号,将不变量直接附加到程序语句上,从而使完整的逻辑分析成为可能。
Hoare had been educated at Oxford, where he studied modern analytic philosophy and thereby became familiar with mathematical logic. He joined the British computer company Elliott Brothers, and there led the team responsible for building a compiler for the ALGOL 60 programming language. That experience brought him in contact with Edsger Dijkstra and the importance of languages in which programming concepts could be expressed elegantly and concisely. In 1968 he took up a position as professor at Queen’s University, Belfast. There he read Robert Floyd’s paper “Assigning meanings to programs” (Floyd, 1967), which described a way of attaching invariant predicates to the edges of a flowchart in such a way that the behavior of a program could be subjected to rigorous analysis. Hoare determined to get rid of the flowchart notation and attach the invariants directly to the program statements, thus making a complete logical analysis imaginable.
霍尔将其职业生涯的成熟时间用于解决并发系统编程中的难题,以及设计语言系统以使此类程序更易于编写和验证。但他的名字也将长期与快速排序算法联系在一起,这是一种易于实现的排序算法,以令人惊讶的方式使用递归。快速排序的这种表述(Hoare,1962)是 Hoare 的 A LGOL 60 经验的另一个副产品。通过使递归易于表达,该语言使算法的结构变得显而易见。
Hoare spent the maturity of his career on difficult problems in the programming of concurrent systems, and on the design of language systems to make such programs easier to write and to verify. But his name will also be long associated with the Quicksort algorithm, an easy-to-implement sorting algorithm that uses recursion in a surprising way. This formulation of Quicksort (Hoare, 1962) was another byproduct of Hoare’s ALGOL 60 experience. By making recursion easy to express, the language made apparent the structure of the algorithm.
自霍尔发表开创性论文以来,正式的程序验证一直有起有落。它现在是生产低级代码的标准工具,人类几乎无法阅读这些代码,因此很难理解和推理。它更为雄心勃勃的形式引起了强烈的反感(见第 44 章)。后来,霍尔仍然主张在编程中使用形式化方法,他承认他对大型系统正确性证明的前景过于乐观(Hoare,1996)。“十年前,形式化方法的研究人员(我是其中最错误的一个)预测编程世界将拥抱感谢形式化所承诺的每一项帮助,以解决当程序变得庞大且安全性更高时出现的可靠性问题,”他写道。“项目现在变得非常庞大且非常关键——远远超出了正式方法可以轻松解决的规模。存在许多问题和失败,但这些几乎总是归因于需求分析不充分或管理控制不充分。事实证明,世界并没有因为我们的研究最初打算解决的问题而受到严重影响。”
Formal program verification has had its ups and downs since Hoare’s seminal paper. It is now a standard tool in the production of low-level code, which can be all but unreadable by humans and correspondingly difficult to understand and reason about. In its more ambitious forms it has excited strong antipathy (see chapter 44). Later, still advocating for formal methods in programming, Hoare acknowledged that he had been too sanguine about the prospects for correctness proofs of large systems (Hoare, 1996). “Ten years ago, researchers into formal methods (and I was the most mistaken among them) predicted that the programming world would embrace with gratitude every assistance promised by formalisation to solve the problems of reliability that arise when programs get large and more safety-critical,” he wrote. “Programs have now got very large and very critical—well beyond the scale which can be comfortably tackled by formal methods. There have been many problems and failures, but these have nearly always been attributable to inadequate analysis of requirements or inadequate management control. It has turned out that the world just does not suffer significantly from the kind of problem that our research was originally intended to solve.”
尽管霍尔撰写本文时正在贝尔法斯特女王大学,但他职业生涯的大部分时间是在牛津大学担任教授。他于2000年被封为爵士。
Although he was at Queen’s University in Belfast when he wrote this paper, Hoare spent most of his career as professor at Oxford. He was knighted in the year 2000.
本文试图通过使用首先应用于几何研究、后来扩展到数学其他分支的技术来探索计算机编程的逻辑基础。这涉及阐明可用于证明计算机程序属性的公理集和推理规则。给出了此类公理和规则的示例,并显示了简单定理的形式证明。最后,有人认为,研究这些主题可能会带来理论和实践上的重要优势。
In this paper an attempt is made to explore the logical foundations of computer programming by use of techniques which were first applied in the study of geometry and have later been extended to other branches of mathematics. This involves the elucidation of sets of axioms and rules of inference which can be used in proofs of the properties of computer programs. Examples are given of such axioms and rules, and a formal proof of a simple theorem is displayed. Finally, it is argued that important advantages, both theoretical and practical, may follow from a pursuance of these topics.
计算机编程是一门精确的科学,原则上,程序的所有属性以及在任何给定环境中执行程序的所有后果都可以通过纯粹的演绎推理从程序本身的文本中找出。演绎推理涉及将有效的推理规则应用于有效的公理集。因此,阐明我们对计算机程序进行推理的公理和推理规则是可取且有趣的。公理的确切选择在某种程度上取决于编程语言的选择。出于说明目的,本文仅限于一种非常简单的语言,它实际上是所有当前面向过程的语言的子集。
Computer programming is an exact science in that all the properties of a program and all the consequences of executing it in any given environment can, in principle, be found out from the text of the program itself by means of purely deductive reasoning. Deductive reasoning involves the application of valid rules of inference to sets of valid axioms. It is therefore desirable and interesting to elucidate the axioms and rules of inference which underlie our reasoning about computer programs. The exact choice of axioms will to some extent depend on the choice of programming language. For illustrative purposes, this paper is confined to a very simple language, which is effectively a subset of all current procedure-oriented languages.
关于程序的有效推理的第一个要求是了解它调用的基本运算的属性,例如整数的加法和乘法。不幸的是,在某些方面,计算机算术与数学家熟悉的算术不同,因此在选择一组适当的公理时必须小心谨慎。例如,图 31.1中显示的公理只是与整数相关的公理的一小部分。从这组不完整的公理可以推导出这样简单的定理为:
The first requirement in valid reasoning about a program is to know the properties of the elementary operations which it invokes, for example, addition and multiplication of integers. Unfortunately, in several respects computer arithmetic is not the same as the arithmetic familiar to mathematicians, and it is necessary to exercise some care in selecting an appropriate set of axioms. For example, the axioms displayed in Figure 31.1 are rather a small selection of axioms relevant to integers. From this incomplete set of axioms it is possible to deduce such simple theorems as:
第二个的证明是:
The proof of the second of these is:
当然,公理 A1 到 A9 对于数学中传统的无限整数集来说是正确的。然而,它们也适用于由计算机操作的有限“整数”集合,前提是它们仅限于非负数。它们的真实性与集合的大小无关。此外,它在很大程度上与“溢出”情况下所采用的技术选择无关;例如:
The axioms A1 to A9 are, of course, true of the traditional infinite set of integers in mathematics. However, they are also true of the finite sets of “integers” which are manipulated by computers provided that they are confined to nonnegative numbers. Their truth is independent of the size of the set; furthermore, it is largely independent of the choice of technique applied in the event of “overflow”; for example:
1、严格解释:溢出运算的结果不存在;当发生溢出时,有问题的程序永远不会完成其操作。请注意,在这种情况下,A1 到 A9 的相等性是严格的,即双方同时存在或不存在。
1. Strict interpretation: the result of an overflowing operation does not exist; when overflow occurs, the offending program never completes its operation. Note that in this case, the equalities of A1 to A9 are strict, in the sense that both sides exist or fail to exist together.
2. 固定边界:将溢出运算的结果作为所表示的最大值。
2. Firm boundary: the result of an overflowing operation is taken as the maximum value represented.
3. 模运算:溢出运算的结果以所表示的整数集的大小为模进行计算。
3. Modulo arithmetic: the result of an overflowing operation is computed modulo the size of the set of integers represented.
这三种技术在图 31.2中通过一个非常小的模型的加法和乘法表进行了说明,其中 0、1、2 和 3 是唯一表示的整数。
These three techniques are illustrated in Figure 31.2 by addition and multiplication tables for a trivially small model in which 0, 1, 2, and 3 are the only integers represented.
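The three treatments of overflow can be tried out on the same toy model in which 0, 1, 2, and 3 are the only integers represented. The sketch below is an illustration only; the names `strict`, `firm_boundary`, `modulo`, `table`, and `Overflow` are hypothetical, and `None` is used to mark a result that does not exist.

```python
from operator import add, mul

MAX = 3  # largest integer represented in the toy model


class Overflow(Exception):
    """Raised under the strict interpretation: the result does not exist."""


def strict(op, x, y):
    result = op(x, y)
    if result > MAX:
        raise Overflow(f"{x}, {y}")
    return result


def firm_boundary(op, x, y):
    # An overflowing result is replaced by the maximum value represented.
    return min(op(x, y), MAX)


def modulo(op, x, y):
    # An overflowing result is reduced modulo the number of integers represented.
    return op(x, y) % (MAX + 1)


def table(rule, op):
    """Build the 4 x 4 operation table for one overflow rule."""
    rows = []
    for x in range(MAX + 1):
        row = []
        for y in range(MAX + 1):
            try:
                row.append(rule(op, x, y))
            except Overflow:
                row.append(None)
        rows.append(row)
    return rows
```

For example, `table(modulo, add)[3]` is `[3, 0, 1, 2]`, while `table(firm_boundary, add)[3]` is `[3, 3, 3, 3]` and `table(strict, add)[3]` is `[3, None, None, None]`. The three rules agree wherever no overflow occurs.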
有趣的是,满足公理 A1 到 A9 的不同系统可以通过选择一组互斥的补充公理中的特定一个来严格区分。例如,无限算术满足公理:
It is interesting to note that the different systems satisfying axioms A1 to A9 may be rigorously distinguished from each other by choosing a particular one of a set of mutually exclusive supplementary axioms. For example, infinite arithmetic satisfies the axiom:
其中所有有限算术满足:
where all finite arithmetics satisfy:
其中“max”表示所表示的最大整数。
where “max” denotes the largest integer represented.
类似地,溢出的三种处理可以通过选择以下与 max + 1 的值相关的公理之一来区分:
Similarly, the three treatments of overflow may be distinguished by a choice of one of the following axioms relating to the value of max + 1:
选择这些公理之一后,就可以用它来推导程序的属性;然而,这些属性不一定会获得,除非程序在满足所选公理的实现上执行。
Having selected one of these axioms, it is possible to use it in deducing the properties of programs; however, these properties will not necessarily obtain, unless the program is executed on an implementation which satisfies the chosen axiom.
如上所述,本研究的目的是为程序属性的证明提供逻辑基础。程序最重要的属性之一是它是否执行其预期功能。程序或程序的一部分的预期功能可以通过对程序执行后相关变量将采用的值进行一般断言来指定。这些断言通常不会为每个变量赋予特定的值,而是指定这些值的某些一般属性以及它们之间的关系。我们使用数学逻辑的常规符号来表达这些断言,并且尽可能使用熟悉的运算符优先级规则来提高易读性。
As mentioned above, the purpose of this study is to provide a logical basis for proofs of the properties of a program. One of the most important properties of a program is whether or not it carries out its intended function. The intended function of a program, or part of a program, can be specified by making general assertions about the values which the relevant variables will take after execution of the program. These assertions will usually not ascribe particular values to each variable, but will rather specify certain general properties of the values and the relationships holding between them. We use the normal notations of mathematical logic to express these assertions, and the familiar rules of operator precedence have been used wherever possible to improve legibility.
在许多情况下,程序(或程序的一部分)结果的有效性将取决于程序启动之前变量所取的值。这些成功使用的初始先决条件可以通过与用于描述终止时获得的结果相同类型的一般断言来指定。为了说明前提条件 ( P )、程序 ( Q ) 及其执行结果的描述 ( R ) 之间所需的联系,我们引入一种新的符号:
In many cases, the validity of the results of a program (or part of a program) will depend on the values taken by the variables before that program is initiated. These initial preconditions of successful use can be specified by the same type of general assertion as is used to describe the results obtained on termination. To state the required connection between a precondition ($P$), a program ($Q$) and a description of the result of its execution ($R$), we introduce a new notation:
$$P\,\{Q\}\,R$$
这可以解释为“如果断言 $P$ 在启动程序 $Q$ 之前为真,则断言 $R$ 在程序完成时将为真。”如果没有施加任何前提条件,我们写成 $\mathbf{true}\,\{Q\}\,R$。
This may be interpreted “If the assertion $P$ is true before initiation of a program $Q$, then the assertion $R$ will be true on its completion.” If there are no preconditions imposed, we write $\mathbf{true}\,\{Q\}\,R$.
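The reading of the notation can be illustrated by checking a triple over a handful of initial states. This is testing by enumeration, not the deductive proof the paper is after; the helper `holds`, the program `increment_x`, and the representation of a state as a dictionary are assumptions of the sketch, and the program is assumed to terminate.

```python
def holds(pre, program, post, states):
    """Check P{Q}R on the given initial states.

    pre and post are predicates on a state (a dict of variable values);
    program maps an initial state to a final state.
    """
    return all(post(program(dict(s))) for s in states if pre(s))


def increment_x(state):
    """The program Q: x := x + 1."""
    state["x"] = state["x"] + 1
    return state


states = [{"x": v} for v in range(-5, 6)]

# x >= 0 {x := x + 1} x > 0 holds on every sampled state.
ok = holds(lambda s: s["x"] >= 0, increment_x, lambda s: s["x"] > 0, states)

# true {x := x + 1} x > 0 fails, for example from x = -3.
not_ok = holds(lambda s: True, increment_x, lambda s: s["x"] > 0, states)
```

Here `ok` is `True` and `not_ok` is `False`: dropping the precondition makes the claim about the result too strong.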
下面给出的处理基本上来自 Floyd (1967),但适用于文本而不是流程图。
The treatment given below is essentially due to Floyd (1967) but is applied to texts rather than flowcharts.
x是简单变量的标识符;
x is an identifier for a simple variable;
f是没有副作用的编程语言的表达式,但可能包含x。
f is an expression of a programming language without side effects, but possibly containing x.
Now any assertion P(x) which is to be true of (the value of) x after the assignment is made must also have been true of (the value of) the expression f, taken before the assignment is made, i.e. with the old value of x. Thus if P(x) is to be true after the assignment, then P(f) must be true before the assignment. This fact may be expressed more formally:
D0 Axiom of Assignment
⊢ P0 {x := f} P
where
x is a variable identifier;
f is an expression;
P0 is obtained from P by substituting f for all occurrences of x.
It may be noticed that D0 is not really an axiom at all, but rather an axiom schema, describing an infinite set of axioms which share a common pattern. This pattern is described in purely syntactic terms, and it is easy to check whether any finite text conforms to the pattern, thereby qualifying as an axiom, which may validly appear in any line of a proof.
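Because the schema is purely syntactic, an instance of it can be produced mechanically. The sketch below is one possible illustration, not the paper's method: it assumes assertions and expressions are written as Python expression strings from a trusted source, and uses the standard `ast` module (Python 3.9 or later for `ast.unparse`) to substitute f for every occurrence of x. The names `substituted` and `axiom_instance_agrees` are hypothetical.

```python
import ast

class _Substitute(ast.NodeTransformer):
    """Replace every occurrence of the variable `name` by the expression `expr`."""
    def __init__(self, name: str, expr: str):
        self.name = name
        self.expr = expr

    def visit_Name(self, node: ast.Name) -> ast.AST:
        if node.id == self.name:
            return ast.parse(self.expr, mode="eval").body
        return node

def substituted(post: str, x: str, f: str) -> str:
    """Return the text of P0: P with f substituted for all occurrences of x."""
    tree = _Substitute(x, f).visit(ast.parse(post, mode="eval"))
    return ast.unparse(ast.fix_missing_locations(tree))

def axiom_instance_agrees(post: str, x: str, f: str, states) -> bool:
    """On each sample state, compare P0 before `x := f` with P after it."""
    pre = substituted(post, x, f)
    for s in states:
        after = dict(s)
        after[x] = eval(f, {}, dict(s))          # f is evaluated with the old value of x
        if bool(eval(pre, {}, dict(s))) != bool(eval(post, {}, after)):
            return False
    return True

states = [{"x": a, "y": b} for a in range(-3, 4) for b in range(-3, 4)]

# Instance of D0 for the assignment x := x + y and the assertion x * 2 > y.
agrees = axiom_instance_agrees("x * 2 > y", "x", "x + y", states)
```

The substitution is applied to the syntax tree rather than to the raw text, so an expression such as `x + y` is inserted as a single operand and operator precedence is preserved.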
D1 Rules of Consequence
If ⊢ P{Q}R and ⊢ R ⊃ S then ⊢ P{Q}S
If ⊢ P{Q}R and ⊢ S ⊃ P then ⊢ S{Q}R
The inference rule associated with composition states that if the proven result of the first part of a program is identical with the precondition under which the second part of the program produces its intended result, then the whole program will produce the intended result, provided that the precondition of the first part is satisfied.
In more formal terms:
D2 Rule of Composition
If ⊢ P{Q1}R1 and ⊢ R1{Q2}R then ⊢ P{(Q1; Q2)}R
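As a small illustration of the rule, the following Python sketch checks the two premises and the conclusion on sample states for the pair of assignments r := x and q := 0. This is an assumed example chosen for illustration; the state representation and the helper names `seq` and `triple` are hypothetical, and a check on samples is evidence rather than proof.

```python
from itertools import product

def set_r(s):
    """Q1: r := x."""
    return {**s, "r": s["x"]}

def set_q(s):
    """Q2: q := 0."""
    return {**s, "q": 0}

def seq(first, second):
    """(Q1; Q2): run the first program, then the second on the resulting state."""
    return lambda s: second(first(s))

def triple(pre, prog, post, states):
    """Check pre {prog} post on every sample state that satisfies pre."""
    return all(post(prog(s)) for s in states if pre(s))

P = lambda s: True                                   # no precondition
R1 = lambda s: s["x"] == s["r"]                      # intermediate assertion
R = lambda s: s["x"] == s["r"] + s["y"] * s["q"]     # final assertion

states = [{"x": x, "y": y, "r": r, "q": q}
          for x, y, r, q in product(range(4), repeat=4)]

premise_1 = triple(P, set_r, R1, states)                 # P {Q1} R1
premise_2 = triple(R1, set_q, R, states)                 # R1 {Q2} R
conclusion = triple(P, seq(set_r, set_q), R, states)     # P {(Q1; Q2)} R
```

The intermediate assertion R1 is what links the two parts: it is both the result established by Q1 and the precondition required by Q2.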
In executing this statement, a computer first tests the condition B. If this is false, S is omitted, and execution of the loop is complete. Otherwise, S is executed and B is tested again. This action is repeated until B is found to be false. The reasoning which leads to a formulation of an inference rule for iteration is as follows. Suppose P to be an assertion which is always true on completion of S, provided that it is also true on initiation. Then obviously P will still be true after any number of iterations of the statement S (even no iterations). Furthermore, it is known that the controlling condition B is false when the iteration finally terminates. A slightly more powerful formulation is possible in light of the fact that B may be assumed to be true on initiation of S:
D3 Rule of Iteration
If ⊢ P ∧ B{S}P then ⊢ P{while B do S}¬B ∧ P
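The reasoning behind the rule can be exercised on a concrete loop. The Python sketch below is illustrative only: it assumes a loop body that subtracts y from r and increments q, with the assertion x = r + y × q playing the role of P. The names `while_do` and `fuel` are hypothetical, and the iteration bound exists because nothing in the rule guarantees termination.

```python
from itertools import product

B = lambda s: s["y"] <= s["r"]                       # controlling condition
P = lambda s: s["x"] == s["r"] + s["y"] * s["q"]     # assertion preserved by S

def S(s):
    """Loop body: r := r - y; q := 1 + q."""
    return {**s, "r": s["r"] - s["y"], "q": 1 + s["q"]}

def while_do(cond, body, s, fuel=10_000):
    """Run `while cond do body`, giving up after `fuel` iterations."""
    while cond(s):
        if fuel == 0:
            raise RuntimeError("gave up: the loop may not terminate")
        s, fuel = body(s), fuel - 1
    return s

# y ranges over positive values only, so every sampled loop terminates.
states = [{"x": x, "y": y, "r": r, "q": q}
          for x, y, r, q in product(range(6), range(1, 4), range(6), range(3))]

# Premise: if P and B hold before S, then P holds after S.
premise = all(P(S(s)) for s in states if P(s) and B(s))

# Conclusion, observed on runs starting from states that satisfy P:
# on termination, P still holds and B is false.
finals = [while_do(B, S, s) for s in states if P(s)]
conclusion = all(P(t) and not B(t) for t in finals)
```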
An important property of this program is that when it terminates, we can recover the numerator x by adding to the remainder r the product of the divisor y and the quotient q (i.e. x = r + y × q). Furthermore, the remainder is less than the divisor. These properties may be expressed formally:
true {Q} ¬ y ≤ r ∧ x = r + y × q
where Q stands for the program displayed above. This expresses a necessary (but not sufficient) condition for the “correctness” of the program.
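The program itself is not reproduced in this excerpt. The Python sketch below assumes it is division by repeated subtraction on nonnegative integers, which is what the stated property describes; the function name `divide` is hypothetical. The `assert` states the property at the point of termination.

```python
def divide(x: int, y: int):
    """Quotient and remainder of x by y, found by repeated subtraction."""
    r, q = x, 0                  # r := x; q := 0
    while y <= r:                # terminates when x >= 0 and y > 0
        r, q = r - y, 1 + q      # r := r - y; q := 1 + q
    # The stated property: not (y <= r) and x = r + y * q.
    assert not (y <= r) and x == r + y * q
    return q, r

quotient, remainder = divide(17, 5)
```

With y = 0 and x ≥ 0 the loop never exits and the assertion is never reached, so the property says nothing about that case. This is one sense in which it is necessary but not sufficient for correctness.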
A formal proof of this theorem is given [EDITOR: in an omitted figure]. Like all formal proofs, it is excessively tedious, and it would be fairly easy to introduce notational conventions which would significantly shorten it. An even more powerful method of reducing the tedium of formal proofs is to derive general rules for proof construction out of the simple rules accepted as postulates. These general rules would be shown to be valid by demonstrating how every theorem proved with their assistance could equally well (if more tediously) have been proved without. Once a powerful set of supplementary rules has been developed, a “formal proof” reduces to little more than an informal indication of how a formal proof could be constructed.
The axioms and rules of inference quoted in this paper have implicitly assumed the absence of side effects of the evaluation of expressions and conditions. In proving properties of programs expressed in a language permitting side effects, it would be necessary to prove their absence in each case before applying the appropriate proof technique. If the main purpose of a high level programming language is to assist in the construction and verification of correct programs, it is doubtful whether the use of functional notation to call procedures with side effects is a genuine advantage. Another deficiency in the axioms and rules quoted above is that they give no basis for a proof that a program successfully terminates. Failure to terminate may be due to an infinite loop; or it may be due to violation of an implementation-defined limit, for example, the range of numeric operands, the size of storage, or an operating system time limit. Thus the notation “P{Q}R” should be interpreted “provided that the program successfully terminates, the properties of its results are described by R.” It is fairly easy to adapt the axioms so that they cannot be used to predict the “results” of nonterminating programs; but the actual use of the axioms would now depend on knowledge of many implementation-dependent features, for example, the size and speed of the computer, the range of numbers, and the choice of overflow technique. Apart from proofs of the avoidance of infinite loops, it is probably better to prove the “conditional” correctness of a program and rely on an implementation to give a warning if it has had to abandon execution of the program as a result of violation of an implementation limit.
Finally it is necessary to list some of the areas which have not been covered: for example, real arithmetic, bit and character manipulation, complex arithmetic, fractional arithmetic, arrays, records, overlay definition, files, input/output, declarations, subroutines, parameters, recursion, and parallel execution. Even the characterization of integer arithmetic is far from complete. There does not appear to be any great difficulty in dealing with these points, provided that the programming language is kept simple. Areas which do present real difficulty are labels and jumps, pointers, and name parameters. Proofs of programs which made use of these features are likely to be elaborate, and it is not surprising that this should be reflected in the complexity of the underlying axioms.
The most important property of a program is whether it accomplishes the intentions of its user. If these intentions can be described rigorously by making assertions about the values of variables at the end (or at intermediate points) of the execution of the program, then the techniques described in this paper may be used to prove the correctness of the program, provided that the implementation of the programming language conforms to the axioms and rules which have been used in the proof. This fact itself might also be established by deductive reasoning, using an axiom set which describes the logical properties of the hardware circuits. When the correctness of a program, its compiler, and the hardware of the computer have all been established with mathematical certainty, it will be possible to place great reliance on the results of the program, and predict their properties with a confidence limited only by the reliability of the electronics.
The practice of supplying proofs for nontrivial programs will not become widespread until considerably more powerful proof techniques become available, and even then will not be easy. But the practical advantages of program proving will eventually outweigh the difficulties, in view of the increasing costs of programming error. At present, the method which a programmer uses to convince himself of the correctness of his program is to try it out in particular cases and to modify it if the results produced do not correspond to his intentions. After he has found a reasonably wide variety of example cases on which the program seems to work, he believes that it will always work. The time spent in this program testing is often more than half the time spent on the entire programming project; and with a realistic costing of machine time, two thirds (or more) of the cost of the project is involved in removing errors during this phase.
The cost of removing errors discovered after a program has gone into use is often greater, particularly in the case of items of computer manufacturer’s software for which a large part of the expense is borne by the user. And finally, the cost of error in certain types of program may be almost incalculable—a lost spacecraft, a collapsed building, a crashed aeroplane, or a world war. Thus the practice of program proving is not only a theoretical pursuit, followed in the interests of academic respectability, but a serious recommendation for the reduction of the costs associated with programming error.
The practice of proving programs is likely to alleviate some of the other problems which afflict the computing world. For example, there is the problem of program documentation, which is essential, firstly, to inform a potential user of a subroutine how to use it and what it accomplishes, and secondly, to assist in further development when it becomes necessary to update a program to meet changing circumstances or to improve it in the light of increased knowledge. The most rigorous method of formulating the purpose of a subroutine, as well as the conditions of its proper use, is to make assertions about the values of variables before and after its execution. The proof of the correctness of these assertions can then be used as a lemma in the proof of any program which calls the subroutine. Thus, in a large program, the structure of the whole can be clearly mirrored in the structure of its proof. Furthermore, when it becomes necessary to modify a program, it will always be valid to replace any subroutine by another which satisfies the same criterion of correctness. Finally, when examining the detail of the algorithm, it seems probable that the proof will be helpful in explaining not only what is happening but why.
Another problem which can be solved, insofar as it is soluble, by the practice of program proofs is that of transferring programs from one design of computer to another. Even when written in a so-called machine-independent programming language, many large programs inadvertently take advantage of some machine-dependent property of a particular implementation, and unpleasant and expensive surprises can result when attempting to transfer it to another machine. However, presence of a machine-dependent feature will always be revealed in advance by the failure of an attempt to prove the program from machine-independent axioms. The programmer will then have the choice of formulating his algorithm in a machine-independent fashion, possibly with the help of environment enquiries; or if this involves too much effort or inefficiency, he can deliberately construct a machine-dependent program, and rely for his proof on some machine-dependent axiom, for example, one of the versions of A11 (page 300). In the latter case, the axiom must be explicitly quoted as one of the preconditions of successful use of the program. The program can still, with complete confidence, be transferred to any other machine which happens to satisfy the same machine-dependent axiom; but if it becomes necessary to transfer it to an implementation which does not, then all the places where changes are required will be clearly annotated by the fact that the proof at that point appeals to the truth of the offending machine-dependent axiom.
Thus the practice of proving programs would seem to lead to solution of three of the most pressing problems in software and programming, namely, reliability, documentation, and compatibility. However, program proving, certainly at present, will be difficult even for programmers of high caliber; and may be applicable only to quite simple program designs. As in other areas, reliability can be purchased only at the price of simplicity.
A high level programming language, such as ALGOL, FORTRAN, or COBOL, is usually intended to be implemented on a variety of computers of differing size, configuration, and design. It has been found a serious problem to define these languages with sufficient rigour to ensure compatibility among all implementors. Since the purpose of compatibility is to facilitate interchange of programs expressed in the language, one way to achieve this would be to insist that all implementations of the language shall “satisfy” the axioms and rules of inference which underlie proofs of the properties of programs expressed in the language, so that all predictions based on these proofs will be fulfilled, except in the event of hardware failure. In effect, this is equivalent to accepting the axioms and rules of inference as the ultimately definitive specification of the meaning of the language.
Apart from giving an immediate and possibly even provable criterion for the correctness of an implementation, the axiomatic technique for the definition of programming language semantics appears to be like the formal syntax of the ALGOL 60 report, in that it is sufficiently simple to be understood both by the implementor and by the reasonably sophisticated user of the language. It is only by bridging this widening communication gap in a single document (perhaps even provably consistent) that the maximum advantage can be obtained from a formal language definition.
Another of the great advantages of using an axiomatic approach is that axioms offer a simple and flexible technique for leaving certain aspects of a language undefined, for example, range of integers, accuracy of floating point, and choice of overflow technique. This is absolutely essential for standardization purposes, since otherwise the language will be impossible to implement efficiently on differing hardware designs. Thus a programming language standard should consist of a set of axioms of universal applicability, together with a choice from a set of supplementary axioms describing the range of choices facing an implementor. An example of the use of axioms for this purpose was given in §31.2.
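To make the idea of supplementary axioms concrete, the Python sketch below models three common overflow conventions for nonnegative integer addition. It is an assumption-laden illustration: the bound `MAX` and the three function names are hypothetical, and the conventions shown are familiar ones that may not match the cited section's axioms exactly. A property such as commutativity holds under all three, while the value of MAX + 1 is where they differ.

```python
MAX = 2**15 - 1  # hypothetical largest representable integer

def add_strict(a: int, b: int) -> int:
    """Overflow is an error: no result beyond MAX is ever produced."""
    total = a + b
    if total > MAX:
        raise OverflowError("result exceeds MAX")
    return total

def add_saturating(a: int, b: int) -> int:
    """Firm boundary: results beyond MAX are clamped to MAX."""
    return min(a + b, MAX)

def add_modulo(a: int, b: int) -> int:
    """Wrap-around: arithmetic is taken modulo MAX + 1."""
    return (a + b) % (MAX + 1)

# A universally applicable property: addition is commutative under every convention.
pairs = [(0, 0), (1, MAX - 1), (MAX, 1), (MAX, MAX)]
commutative = all(add(a, b) == add(b, a)
                  for add in (add_saturating, add_modulo)
                  for a, b in pairs)
```

A program whose proof uses only the universal properties runs unchanged under all three conventions. One whose proof appeals to the behaviour at MAX + 1 is tied to the convention it quotes.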
Another of the objectives of formal language definition is to assist in the design of better programming languages. The regularity, clarity, and ease of implementation of the ALGOL 60 syntax may at least in part be due to the use of an elegant formal technique for its definition. The use of axioms may lead to similar advantages in the area of “semantics,” since it seems likely that a language which can be described by a few “self-evident” axioms from which proofs will be relatively easy to construct will be preferable to a language with many obscure axioms which are difficult to apply in proofs. Furthermore, axioms enable the language designer to express his general intentions quite simply and directly, without the mass of detail which usually accompanies algorithmic descriptions. Finally, axioms can be formulated in a manner largely independent of each other, so that the designer can work freely on one axiom or group of axioms without fear of unexpected interaction effects with other parts of the language. …
Reprinted from Hoare (1969), with permission from the Association for Computing Machinery.
It is common in academic circles to think that the science of computing is about algorithms. But in business, computing has always been about data. Of course the two perspectives are not opposed, but the world looks very different from a data-oriented view than from an algorithm-oriented view. A database scientist told me that data was the ocean, deep, eternal, and mysterious, and algorithms were just boats skimming its surface.
Until the 1960s, most practical applications of computers were for computing numbers—either values of mathematical functions (such as the tables generated by Aiken’s Mark I) or parameters of physical phenomena (astronomical calculations, for example, or the predicted behavior of an atom bomb). Of course any calculation involving physical phenomena requires numerical data, but early computers did not have enough storage to process large amounts of it. The scientific interest in computing was driven by the need to perform numerical calculations, plus the extraordinary discovery by Church and Turing of the ontology of algorithms. Turing’s remarkable codebreaking work at Bletchley Park was an exercise of logic and carefully controlled combinatorial search driven by very small amounts of data (intercepted encrypted messages).
Certainly Vannevar Bush had foreseen the importance of large-scale data manipulation (“Selection devices … will soon be speeded up from their present rate of reviewing data at a few hundred a minute,” page 115). Howard Aiken and Grace Hopper anticipated the importance of business applications, and the International Business Machines Corporation, which had grown out of companies supplying tabulating machines for the U.S. Census, came to dominate the business market for computers. Starting in the mid-1960s, IBM developed a database product to manage inventory for the Apollo space program. The data model of this system, known originally as IMS, drew ultimately on graph theory: mechanical parts had sub-parts, and the same sub-parts might be used in several different larger assemblies, so the connection between data entities resembled the links connecting nodes of a directed graph.
The quantity of data managed by such systems steadily grew, and it became clear that the way data were described and queried need not correspond to the memory structures used to store it. In much the same way as compilers had made it possible to abstract the statement of algorithms from the particulars of the machine code that executed them, there was a need to talk about data with some formal clarity, leaving to computer systems themselves the problem of optimally organizing the data in physical devices for most effective storage and retrieval.
For Edgar Frank Codd (1923–2003), the way to talk about data was to use a language derived from the predicate calculus. Data was to be organized as relations. A relation is a set of n-tuples, for example a set of ordered triples or ordered quadruples, where each position or “column” contains data of a particular type. The relation can be depicted as a table, with one n-tuple per row, but the order of the rows in the table is semantically irrelevant because the n-tuples are logically a set.
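A minimal Python sketch of this definition follows. The relation `Part` and its columns are hypothetical, chosen only to illustrate that a relation is a set of typed n-tuples and that the row order of any table depicting it carries no meaning.

```python
from typing import NamedTuple

class Part(NamedTuple):
    """One 3-tuple of a hypothetical `part` relation; each position has a fixed type."""
    part_no: int
    name: str
    qty_on_hand: int

table_a = [Part(1, "bolt", 40), Part(2, "nut", 75), Part(3, "washer", 12)]
table_b = list(reversed(table_a))          # same rows, listed in a different order

relation_a = frozenset(table_a)
relation_b = frozenset(table_b)

# The two tables depict the same relation, because a relation is a set.
same_relation = relation_a == relation_b

# Listing a row twice adds nothing: a set has no duplicate members.
with_repeat = frozenset(table_a + [Part(1, "bolt", 40)])
```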
In this seminal paper Codd worked out the basics of the relational view of data, and most importantly, developed a relational algebra for combining relations. Codd spent much of his career with IBM, and this work emerged from his frustration that existing database systems entangled the database’s logical structure (and thus the logic of programs accessing the database) with its internal “physical” representation. IBM welcomed Codd’s innovations only tepidly, perhaps because a successful relational database system would compete with existing IBM products, so Codd left to start his own firm. Research implementations of the relational database model began to appear later in the 1970s, with IBM’s System R and the Ingres system of Michael Stonebraker at Berkeley. Oracle Corporation, Tandem Computers, Stonebraker’s Relational Technology Inc., and IBM all released commercial implementations between 1979 and 1981, establishing the model as a de facto standard. The model and the associated data management language SQL are now ubiquitous, and Codd was recognized with the Turing award for his contribution.
This paper is concerned with the application of elementary relation theory to systems which provide shared access to large banks of formatted data. Except for a paper by Childs (1968), the principal application of relations to data systems has been to deductive question-answering systems. Levien and Maron (1967) provide numerous references to work in this area.
In contrast, the problems treated here are those of data independence—the independence of application programs and terminal activities from growth in data types and changes in data representation—and certain kinds of data inconsistency which are expected to become troublesome even in nondeductive systems.
The relational view (or model) of data described in §32.1 appears to be superior in several respects to the graph or network model (Bachman, 1965; McGee, 1969) presently in vogue for noninferential systems. It provides a means of describing data with its natural structure only—that is, without superimposing any additional structure for machine representation purposes. Accordingly, it provides a basis for a high level data language which will yield maximal independence between programs on the one hand and machine representation and organization of data on the other.
A further advantage of the relational view is that it forms a sound basis for treating derivability, redundancy, and consistency of relations—these are discussed in §32.2. The network model, on the other hand, has spawned a number of confusions, not the least of which is mistaking the derivation of connections for the derivation of relations ….
Finally, the relational view permits a clearer evaluation of the scope and logical limitations of present formatted data systems, and also the relative merits (from a logical standpoint) of competing representations of data within a single system. Examples of this clearer perspective are cited in various parts of this paper. Implementations of systems to support the relational model are not discussed.
32.1.2.1 Ordering dependence
Elements of data in a data bank may be stored in a variety of ways, some involving no concern for ordering, some permitting each element to participate in one ordering only, others permitting each element to participate in several orderings. Let us consider those existing systems which either require or permit data elements to be stored in at least one total ordering which is closely associated with the hardware-determined ordering of addresses. For example, the records of a file concerning parts might be stored in ascending order by part serial number. Such systems normally permit application programs to assume that the order of presentation of records from such a file is identical to (or is a subordering of) the stored ordering. Those application programs which take advantage of the stored ordering of a file are likely to fail to operate correctly if for some reason it becomes necessary to replace that ordering by a different one. Similar remarks hold for a stored ordering implemented by means of pointers.
It is unnecessary to single out any system as an example, because all the well-known information systems that are marketed today fail to make a clear distinction between order of presentation on the one hand and stored ordering on the other. Significant implementation problems must be solved to provide this kind of independence.
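The failure described here can be sketched in a few lines of Python. The file contents and both function names are hypothetical. The first function silently relies on the records being presented in ascending serial-number order; the second states the ordering it needs and so is unaffected when the stored ordering changes.

```python
def next_part_fragile(records, serial):
    """Return the name of the first part after `serial`.
    Assumes records are presented in ascending serial-number order."""
    for number, name in records:
        if number > serial:
            return name          # correct only if the stored order is ascending
    return None

def next_part_robust(records, serial):
    """Same question, with the required ordering made explicit."""
    later = [(number, name) for number, name in records if number > serial]
    return min(later)[1] if later else None

stored_ascending = [(101, "bolt"), (205, "nut"), (317, "washer")]
stored_reordered = [(317, "washer"), (205, "nut"), (101, "bolt")]

fragile_before = next_part_fragile(stored_ascending, 100)
fragile_after = next_part_fragile(stored_reordered, 100)
robust_before = next_part_robust(stored_ascending, 100)
robust_after = next_part_robust(stored_reordered, 100)
```

The fragile version raises no error after the reordering; it simply returns a different, wrong answer, which is what makes this kind of dependence costly to detect.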
32.1.2.2 Indexing dependence
In the context of formatted data, an index is usually thought of as a purely performance-oriented component of the data representation. It tends to improve response to queries and updates and, at the same time, slow down response to insertions and deletions. From an informational standpoint, an index is a redundant component of the data representation. If a system uses indices at all and if it is to perform well in an environment with changing patterns of activity on the data bank, an ability to create and destroy indices from time to time will probably be necessary. The question then arises: Can application programs and terminal activities remain invariant as indices come and go?
Present formatted data systems take widely different approaches to indexing. TDMS (Bleier, 1967) unconditionally provides indexing on all attributes. The presently released version of IMS (IBM, 1965b) provides the user with a choice for each file: a choice between no indexing at all (the hierarchic sequential organization) or indexing on the primary key only (the hierarchic indexed sequential organization). In neither case is the user’s application logic dependent on the existence of the unconditionally provided indices. IDS, however, permits the file designers to select attributes to be indexed and to incorporate indices into the file structure by means of additional chains. Application programs taking advantage of the performance benefit of these indexing chains must refer to those chains by name. Such programs do not operate correctly if these chains are later removed.
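One way to picture index independence is an interface whose queries never name an index. The Python sketch below is a hypothetical illustration and does not model TDMS, IMS, or IDS: the class `PartsFile` answers `find_by_name` identically whether or not an index exists, so an index can be created or dropped without touching the calling program.

```python
class PartsFile:
    """A file of (part_no, name, qty) rows with an optional index on name."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._index = None                     # no index until one is created

    def create_index(self):
        self._index = {}
        for row in self._rows:
            self._index.setdefault(row[1], []).append(row)

    def drop_index(self):
        self._index = None

    def insert(self, row):
        self._rows.append(row)
        if self._index is not None:            # extra work: the index slows insertion
            self._index.setdefault(row[1], []).append(row)

    def find_by_name(self, name):
        """Same answer with or without the index; only the cost differs."""
        if self._index is not None:
            return list(self._index.get(name, []))
        return [row for row in self._rows if row[1] == name]

parts = PartsFile([(10, "bolt", 40), (20, "nut", 75), (30, "bolt", 5)])
unindexed = parts.find_by_name("bolt")
parts.create_index()
indexed = parts.find_by_name("bolt")
parts.insert((40, "bolt", 1))
after_insert = parts.find_by_name("bolt")
parts.drop_index()
after_drop = parts.find_by_name("bolt")
```

A program that instead had to refer to the index by name would stop working the moment `drop_index` was called.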
32.1.2.3 Access path dependence
Many of the existing formatted data systems provide users with tree-structured files or slightly more general network models of the data. Application programs developed to work with these systems tend to be logically impaired if the trees or networks are changed in structure. A simple example follows.
Suppose the data bank contains information about parts and projects. For each part, the part number, part name, part description, quantity-on-hand, and quantity-on-order are recorded. For each project, the project number, project name, project description are recorded. Whenever a project makes use of a certain part, the quantity of that part committed to the given project is also recorded. Suppose that the system requires the user or file designer to declare or define the data in terms of tree structures. Then, any one of the hierarchical structures may be adopted for the information mentioned above (see Structures 1–5, Figures 32.1–32.5).
图 32.1: 结构 1. 隶属于各部分的项目
Figure 32.1: Structure 1. Projects subordinate to parts
图 32.2: 结构 2. 项目所属部分
Figure 32.2: Structure 2. Parts subordinate to projects
图 32.3: 结构 3. 零件和项目是对等的,承诺关系从属于项目
Figure 32.3: Structure 3. Parts and projects as peers, commitment relationship subordinate to projects
图 32.4: 结构 4. 零件和项目对等,承诺关系从属于零件
Figure 32.4: Structure 4. Parts and projects as peers, commitment relationship subordinate to parts
图 32.5: 结构 5. 零件、项目和同等的承诺关系
Figure 32.5: Structure 5. Parts, projects, and commitment relationship as peers
现在,考虑打印出项目名称为“alpha”的项目中使用的每个零件的零件编号、零件名称和数量的问题。无论选择哪种可用的面向树的信息系统来解决这个问题,都可以进行以下观察。如果针对此问题开发的程序P假设上述五种结构之一(即P不进行测试来确定哪种结构有效),那么P将在至少三个剩余结构上失败。更具体地说,如果结构 5 P成功,它将和其他人一样失败;如果P在结构 3 或 4 上成功,那么它至少会在结构 1、2 和 5 上失败;如果P在 1 或 2 上成功,那么它至少会在 3、4 和 5 上失败。每种情况的原因都很简单。在没有测试来确定哪个结构有效的情况下,P会失败,因为尝试执行对不存在文件的引用(可用系统将此视为错误)或未尝试执行对文件的引用包含所需的信息。不相信的读者应该为这个简单的问题开发示例程序。
Now, consider the problem of printing out the part number, part name, and quantity committed for every part used in the project whose project name is “alpha.” The following observations may be made regardless of which available tree-oriented information system is selected to tackle this problem. If a program P is developed for this problem assuming one of the five structures above—that is, P makes no test to determine which structure is in effect—then P will fail on at least three of the remaining structures. More specifically, if P succeeds with structure 5, it will fail with all the others; if P succeeds with structure 3 or 4, it will fail with at least 1, 2, and 5; if P succeeds with 1 or 2, it will fail with at least 3, 4, and 5. The reason is simple in each case. In the absence of a test to determine which structure is in effect, P fails because an attempt is made to execute a reference to a nonexistent file (available systems treat this as an error) or no attempt is made to execute a reference to a file containing needed information. The reader who is not convinced should develop sample programs for this simple problem.
一般来说,由于开发测试系统允许的所有树结构的应用程序是不切实际的,因此当需要改变结构时这些程序就会失败。
Since, in general, it is not practical to develop application programs which test for all tree structurings permitted by the system, these programs fail when a change in structure becomes necessary.
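The failure mode described above can be made concrete with a minimal Python sketch. The nested dictionaries, the field names, and the helpers `report_assuming_structure_2` and `try_report` are hypothetical stand-ins for tree-structured files; they are not taken from any of the systems cited.

```python
# Hypothetical data: the same facts stored under two of the five tree structures.
# Structure 2: parts subordinate to projects.
structure_2 = {
    "projects": [
        {"project_no": 1, "name": "alpha",
         "parts": [{"part_no": 7, "name": "bolt", "qty_committed": 40}]},
    ]
}

# Structure 5: parts, projects, and commitments as peer files.
structure_5 = {
    "projects": [{"project_no": 1, "name": "alpha"}],
    "parts": [{"part_no": 7, "name": "bolt"}],
    "commit": [{"project_no": 1, "part_no": 7, "qty_committed": 40}],
}


def report_assuming_structure_2(db):
    """Program P: walks project -> parts with no test of which structure is in effect."""
    rows = []
    for project in db["projects"]:
        if project["name"] == "alpha":
            # This access path exists only when parts are subordinate to projects.
            for part in project["parts"]:
                rows.append((part["part_no"], part["name"], part["qty_committed"]))
    return rows


def try_report(db):
    """Run P and report a missing access path instead of crashing."""
    try:
        return report_assuming_structure_2(db)
    except KeyError as missing:
        return f"failed: no access path {missing}"


print(try_report(structure_2))
print(try_report(structure_5))
```

The information needed is present in both structures; only the access path differs, and that alone is enough to break P.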
为用户提供数据网络模型的系统也遇到了类似的困难。在树和网络情况下,用户(或其程序)都需要利用数据的用户访问路径的集合。这些路径是否与存储表示中的指针定义路径密切对应并不重要——在IDS中,对应关系非常简单,在TDMS中则恰恰相反。无论存储的表示如何,结果是终端活动和程序变得依赖于用户访问路径的持续存在。对此的一种解决方案是采用这样的策略:一旦定义了用户访问路径,则在使用该路径的所有应用程序都已过时之前,该路径不会过时。这样的策略是不切实际的,因为数据库用户社区的整个模型中的访问路径的数量最终会变得过大。
Systems which provide users with a network model of the data run into similar difficulties. In both the tree and network cases, the user (or his program) is required to exploit a collection of user access paths to the data. It does not matter whether these paths are in close correspondence with pointer-defined paths in the stored representation—in IDS the correspondence is extremely simple, in TDMS it is just the opposite. The consequence, regardless of the stored representation, is that terminal activities and programs become dependent on the continued existence of the user access paths. One solution to this is to adopt the policy that once a user access path is defined it will not be made obsolete until all application programs using that path have become obsolete. Such a policy is not practical, because the number of access paths in the total model for the community of users of a data bank would eventually become excessively large.
出于说明的原因,我们将经常使用关系的数组表示,但必须记住,这种特定的表示并不是正在说明的关系视图的重要组成部分。表示n元关系R的数组具有以下属性:
For expository reasons, we shall frequently make use of an array representation of relations, but it must be remembered that this particular representation is not an essential part of the relational view being expounded. An array which represents an n-ary relation R has the following properties:
1. 每行代表R 的一个n 元组。
1. Each row represents an n-tuple of R.
2. 行的顺序并不重要。
2. The ordering of rows is immaterial.
3. 所有行都是不同的。
3. All rows are distinct.
4. 列的排序很重要——它对应于定义R的域的排序S 1 , S 2 , … , S n(不过,请参见下面关于域有序和域无序关系的注释)。
4. The ordering of columns is significant—it corresponds to the ordering S1, S2, …, Sn of the domains on which R is defined (see, however, remarks below on domain-ordered and domain-unordered relations).
5. 每列的重要性通过用相应域的名称进行标记来部分传达。
5. The significance of each column is partially conveyed by labeling it with the name of the corresponding domain.
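As a rough illustration of the five properties, a relation can be modelled in Python as a set of tuples. The name `DOMAINS`, the helper `as_array`, and the values are assumptions made for this sketch and are not the contents of any figure.

```python
# A relation of degree 4 modelled as a set of 4-tuples (values are made up).
DOMAINS = ("supplier", "part", "project", "quantity")  # column labels (property 5)

supply = {
    (1, 2, 5, 17),
    (1, 3, 5, 23),
    (2, 3, 7, 9),
    (1, 3, 5, 23),  # repeated row: a set keeps only one copy (property 3)
}


def as_array(relation):
    """Return one array representation; the sort is for display only (property 2)."""
    return sorted(relation)


for row in as_array(supply):
    # Column position carries meaning: it follows the order of DOMAINS (property 4).
    print(dict(zip(DOMAINS, row)))
```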
图 32.6中的示例说明了 4 级关系,称为“供应”,它反映了从指定供应商到指定项目的指定数量的零件运输进度。
The example in Figure 32.6 illustrates a relation of degree 4, called supply, which reflects the shipments-in-progress of parts from specified suppliers to specified projects in specified quantities.
图 32.6: 4 阶关系
Figure 32.6: A relation of degree 4
有人可能会问:如果列是用相应域的名称标记的,为什么列的顺序很重要?如图32.7中的示例所示,两列可能具有相同的标题(表示相同的域),但在关系方面具有不同的含义。所描述的关系称为组件。它是一个三元关系,前两个域称为“部分”,第三个域称为“数量”。组件( x, y, z )的含义是零件x是零件y的直接组件(或子组件),并且需要零件x的z 个单元才能组装零件y的一个单元。这种关系在零件爆炸问题中起着至关重要的作用。
One might ask: If the columns are labeled by the name of corresponding domains, why should the ordering of columns matter? As the example in Figure 32.7 shows, two columns may have identical headings (indicating identical domains) but possess distinct meanings with respect to the relation. The relation depicted is called component. It is a ternary relation, whose first two domains are called part and third domain is called quantity. The meaning of component (x, y, z) is that part x is an immediate component (or subassembly) of part y, and z units of part x are needed to assemble one unit of part y. It is a relation which plays a critical role in the parts explosion problem.
图 32.7: 具有两个相同域的关系
Figure 32.7: A relation with two identical domains
值得注意的是,一些现有的信息系统(主要是基于树结构文件的信息系统)无法为具有两个或多个相同域的关系提供数据表示。当前版本的 IMS/360(IBM,1965b)就是此类系统的一个示例。
It is a remarkable fact that several existing information systems (chiefly those based on tree-structured files) fail to provide data representations for relations which have two or more identical domains. The present version of IMS/360 (IBM, 1965b) is an example of such a system.
数据库中的全部数据可以被视为随时间变化的关系的集合。这些关系有不同的程度。随着时间的推移,每个n元关系可能会插入额外的n元组、删除现有的 n 元组以及更改其任何现有n元组的组件。
The totality of data in a data bank may be viewed as a collection of time-varying relations. These relations are of assorted degrees. As time progresses, each n-ary relation may be subject to insertion of additional n-tuples, deletion of existing ones, and alteration of components of any of its existing n-tuples.
然而,在许多商业、政府和科学数据库中,某些关系的程度相当高(30 的程度并不罕见)。用户通常不应需要记住任何关系的域排序(例如,订购供应商,然后是零件,然后是项目,然后是关系供应中的数量)。因此,我们建议用户处理的不是领域有序的关系,而是领域无序对应的关系。(用数学术语来说,关系是在域排列下等价的那些关系的等价类(参见第32.2.1节)。)为了实现这一点,域必须至少在任何给定关系内是唯一可识别的,而不需要使用位置。因此,在存在两个或多个相同域的情况下,我们在每种情况下都要求域名由独特的角色名称限定,该角色名称用于标识该域在给定关系中所扮演的角色。例如,在图 32.7的关系组件中,第一个域部分可能由角色名称sub限定,第二个域部分由super限定,以便用户可以处理关系组件及其域 - sub.part、super.part、数量——不考虑这些域之间的任何顺序。
In many commercial, governmental, and scientific data banks, however, some of the relations are of quite high degree (a degree of 30 is not at all uncommon). Users should not normally be burdened with remembering the domain ordering of any relation (for example, the ordering supplier, then part, then project, then quantity in the relation supply). Accordingly, we propose that users deal, not with relations which are domain-ordered, but with relationships which are their domain-unordered counterparts. (In mathematical terms, a relationship is an equivalence class of those relations that are equivalent under permutation of domains (see §32.2.1).) To accomplish this, domains must be uniquely identifiable at least within any given relation, without using position. Thus, where there are two or more identical domains, we require in each case that the domain name be qualified by a distinctive role name, which serves to identify the role played by that domain in the given relation. For example, in the relation component of Figure 32.7, the first domain part might be qualified by the role name sub, and the second by super, so that users could deal with the relationship component and its domains—sub.part, super.part, quantity—without regard to any ordering between these domains.
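One way to picture role-qualified domain names is a record type whose fields are addressed by name rather than by position. The sketch below uses Python's `typing.NamedTuple`; the field names `sub_part` and `super_part` stand in for sub.part and super.part, and the data and the helper `immediate_components` are hypothetical.

```python
from typing import NamedTuple


class Component(NamedTuple):
    sub_part: int    # role name "sub" qualifying the domain "part"
    super_part: int  # role name "super" qualifying the domain "part"
    quantity: int


component = {
    Component(sub_part=1, super_part=5, quantity=9),
    Component(sub_part=2, super_part=5, quantity=7),
    Component(sub_part=3, super_part=6, quantity=2),
}


def immediate_components(relationship, assembly):
    """Parts (with quantities) that go directly into one unit of `assembly`."""
    return {(c.sub_part, c.quantity) for c in relationship if c.super_part == assembly}


# The caller names each domain, so the order in which they are written is irrelevant.
same = Component(quantity=9, super_part=5, sub_part=1)
print(same in component)
print(immediate_components(component, 5))
```

A named tuple still has an underlying positional order, so this only approximates a domain-unordered relationship: the point is that user code never needs to rely on that order.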
综上所述,建议大多数用户应该与由时变关系(而不是关系)的集合组成的数据关系模型进行交互。每个用户不需要知道更多关于其名称及其域名称的任何关系(必要时角色限定)。(当然,与输入计算机系统和从计算机系统检索的任何数据一样,如果用户了解其含义,通常会更有效地使用这些数据。)甚至该信息也可能由系统以菜单形式提供(受安全和隐私限制)根据用户的要求。
To sum up, it is proposed that most users should interact with a relational model of the data consisting of a collection of time-varying relationships (rather than relations). Each user need not know more about any relationship than its name together with the names of its domains (role qualified whenever necessary). (Naturally, as with any data put into and retrieved from a computer system, the user will normally make far more effective use of the data if he is aware of its meaning.) Even this information might be offered in menu style by the system (subject to security and privacy constraints) upon request by the user.
通常有许多替代方法可以为数据库建立关系模型。为了讨论首选方式(或范式),我们必须首先引入一些附加概念(活动域、主键、外键、非简单域),并与信息系统编程中当前使用的术语建立一些联系。在本文的其余部分,我们不会费心去区分关系和关系,除非明确起来有利。
There are usually many alternative ways in which a relational model may be established for a data bank. In order to discuss a preferred way (or normal form), we must first introduce a few additional concepts (active domain, primary key, foreign key, nonsimple domain) and establish some links with terminology currently in use in information systems programming. In the remainder of this paper, we shall not bother to distinguish between relations and relationships except where it appears advantageous to be explicit.
考虑一个数据库的示例,其中包含有关零件、项目和供应商的关系。在以下域中定义了一种称为“部分”的关系:
Consider an example of a data bank which includes relations concerning parts, projects, and suppliers. One relation called part is defined on the following domains:
1. 零件编号
1. part number
2. 零件名称
2. part name
3. 部分颜色
3. part color
4. 零件重量
4. part weight
5.现有数量
5. quantity on hand
6. 订购数量
6. quantity on order
可能还有其他领域。实际上,这些域中的每一个都是一个值池,其中一些或全部可以随时在数据库中表示。虽然可以想象,在某个时刻,所有零件颜色都存在,但不可能存在所有可能的零件重量、零件名称和零件编号。我们将在某个时刻表示的一组值称为该时刻的活动域。
and possibly other domains as well. Each of these domains is, in effect, a pool of values, some or all of which may be represented in the data bank at any instant. While it is conceivable that, at some instant, all part colors are present, it is unlikely that all possible part weights, part names, and part numbers are. We shall call the set of values represented at some instant the active domain at that instant.
通常,给定关系的一个域(或域的组合)具有唯一标识该关系的每个元素(n元组)的值。这样的域(或组合)称为主键。在上面的示例中,零件编号将是主键,而零件颜色则不是。如果主键是一个简单域(不是组合)或一个组合,使得参与的简单域在唯一标识每个元素时都不是多余的,则主键是非冗余的。一个关系可以拥有多个非冗余主键。如果不同的部分总是被赋予不同的名称,则示例中的情况就是这种情况。每当一个关系有两个或多个非冗余主键时,任意选择其中一个并称为该关系的主键。
Normally, one domain (or combination of domains) of a given relation has values which uniquely identify each element (n-tuple) of that relation. Such a domain (or combination) is called a primary key. In the example above, part number would be a primary key, while part color would not be. A primary key is nonredundant if it is either a simple domain (not a combination) or a combination such that none of the participating simple domains is superfluous in uniquely identifying each element. A relation may possess more than one nonredundant primary key. This would be the case in the example if different parts were always given distinct names. Whenever a relation has two or more nonredundant primary keys, one of them is arbitrarily selected and called the primary key of that relation.
一个常见的要求是关系的元素交叉引用相同关系的其他元素或不同关系的元素。键提供了一种面向用户的方式(但不是唯一的方式)来表达此类交叉引用。如果关系R的域(或域组合)不是R的主键,但其元素是某个关系S的主键的值(不排除S和R相同的可能性),我们将称其为外键)。在图 32.6的供应关系中,供应商、零件、项目的组合是主键,而这三个域中的每一个分别是外键。
A common requirement is for elements of a relation to cross-reference other elements of the same relation or elements of a different relation. Keys provide a user-oriented means (but not the only means) of expressing such cross-references. We shall call a domain (or domain combination) of relation R a foreign key if it is not the primary key of R but its elements are values of the primary key of some relation S (the possibility that S and R are identical is not excluded). In the relation supply of Figure 32.6, the combination of supplier, part, project is the primary key, while each of these three domains taken separately is a foreign key.
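The key definitions can be checked against a given instance with a short Python sketch. Two caveats apply: a key is a property of every instance a relation may take over time, so passing these checks on one snapshot is necessary but not sufficient; and the function names, column positions, and data below are assumptions made for illustration.

```python
def is_key(relation, columns):
    """True if the chosen 0-based column positions identify every tuple uniquely."""
    seen = {tuple(row[i] for i in columns) for row in relation}
    return len(seen) == len(relation)


def is_nonredundant_key(relation, columns):
    """A key is nonredundant if dropping any single column stops it being a key."""
    if not is_key(relation, columns):
        return False
    return all(
        not is_key(relation, [c for c in columns if c != dropped])
        for dropped in columns
    )


def references(relation, columns, target, target_key):
    """True if every value under `columns` occurs as a key value of `target`."""
    wanted = {tuple(row[i] for i in columns) for row in relation}
    present = {tuple(row[i] for i in target_key) for row in target}
    return wanted <= present


# Hypothetical instances: part(part_no, name, color) and
# supply(supplier, part, project, quantity).
part = {(1, "bolt", "red"), (2, "nut", "red"), (3, "cam", "blue")}
supply = {(1, 1, 5, 17), (1, 1, 6, 23), (1, 2, 5, 9), (2, 1, 5, 4)}

print(is_key(part, [0]), is_key(part, [2]))
print(is_nonredundant_key(supply, [0, 1, 2]))
print(references(supply, [1], part, [0]))
```

In this instance part number and part name each identify a part on their own, so the combination of the two is a key but a redundant one.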
在以前的工作中,有一种强烈的倾向,即将数据库中的数据视为由两部分组成,一部分由实体描述(例如供应商的描述)组成,另一部分由各种实体或类型之间的关系组成实体的数量(例如,供应关系)。当一个人可能在任何关系中拥有外键时,这种区别就很难维持。在用户的关系模型中,进行这种区分似乎没有任何优势(但是,当将关系概念应用于用户关系集的机器表示时,可能会有一些优势)。
In previous work there has been a strong tendency to treat the data in a data bank as consisting of two parts, one part consisting of entity descriptions (for example, descriptions of suppliers) and the other part consisting of relations between the various entities or types of entities (for example, the supply relation). This distinction is difficult to maintain when one may have foreign keys in any relation whatsoever. In the user’s relational model there appears to be no advantage to making such a distinction (there may be some advantage, however, when one applies relational concepts to machine representations of the user’s set of relationships).
到目前为止,我们已经讨论了在简单域(其元素是原子(不可分解)值的域)上定义的关系的示例。非原子值可以在关系框架内讨论。因此,某些域可能具有作为元素的关系。这些关系又可以在非简单域上定义,等等。例如,定义关系员工的域之一可能是工资历史记录。工资历史域的元素是在域date和域salary上定义的二元关系。工资历史域是所有此类二元关系的集合。在任何时刻,数据库中的工资历史关系实例的数量与雇员的数量一样多。相反,雇员关系只有一个实例。
So far, we have discussed examples of relations which are defined on simple domains—domains whose elements are atomic (nondecomposable) values. Nonatomic values can be discussed within the relational framework. Thus, some domains may have relations as elements. These relations may, in turn, be defined on nonsimple domains, and so on. For example, one of the domains on which the relation employee is defined might be salary history. An element of the salary history domain is a binary relation defined on the domain date and the domain salary. The salary history domain is the set of all such binary relations. At any instant of time there are as many instances of the salary history relation in the data bank as there are employees. In contrast, there is only one instance of the employee relation.
当前数据库术语中的术语“属性”和“重复组”分别大致类似于简单域和非简单域。目前的很多困惑术语是由于未能区分类型和实例(如“记录”),以及一方面数据的用户模型的组件与另一方面其机器表示对应部分之间的差异(再次,我们将“记录”引用为一个例子)。
The terms attribute and repeating group in present data base terminology are roughly analogous to simple domain and nonsimple domain, respectively. Much of the confusion in present terminology is due to failure to distinguish between type and instance (as in “record”) and between components of a user model of the data on the one hand and their machine representation counterparts on the other hand (again, we cite “record” as an example).
例如,考虑图 32.8中展示的关系集合。工作经历和孩子是关系员工的非简单领域。薪资历史是关系工作历史的一个非简单域。图 32.8中的树仅显示了非简单域的这些相互关系。
Consider, for example, the collection of relations exhibited in Figure 32.8. Job history and children are nonsimple domains of the relation employee. Salary history is a nonsimple domain of the relation job history. The tree in Figure 32.8 shows just these interrelationships of the nonsimple domains.
图 32.8: 非标准化集
Figure 32.8: Unnormalized set
标准化进行如下。从树顶部的关系开始,获取其主键并通过插入此主键域或域组合来扩展每个直接从属关系。每个扩展关系的主键由扩展前的主键加上从父关系复制下来的主键组成。现在,从父关系中删除所有非简单域,删除树的顶部节点,并对每个剩余的子树重复相同的操作序列。
Normalization proceeds as follows. Starting with the relation at the top of the tree, take its primary key and expand each of the immediately subordinate relations by inserting this primary key domain or domain combination. The primary key of each expanded relation consists of the primary key before expansion augmented by the primary key copied down from the parent relation. Now, strike out from the parent relation all nonsimple domains, remove the top node of the tree, and repeat the same sequence of operations on each remaining subtree.
对图32.8中的关系集合进行归一化的结果是图32.9中的集合。每个关系的主键都以斜体显示,以显示这些键如何通过规范化进行扩展。
The result of normalizing the collection of relations in Figure 32.8 is the collection in Figure 32.9. The primary key of each relation is italicized to show how such keys are expanded by the normalization.
图 32.9: 归一化集
Figure 32.9: Normalized set
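The normalization procedure can be sketched recursively in Python. Nested lists of dictionaries stand in for nonsimple domains; the field names (`man_no`, `job_date`, and so on), the `KEYS` table, and the sample employee are assumptions made for this sketch, not a transcription of the figures.

```python
def normalize(name, rows, keys, inherited=(), out=None):
    """Flatten nested relations into a dict of flat relations.

    `keys` maps each relation name to its own primary-key fields;
    `inherited` holds (field, value) pairs copied down from ancestors.
    """
    out = {} if out is None else out
    flat_rows = out.setdefault(name, [])
    for row in rows:
        # Strike out the nonsimple domains, keeping only simple values.
        simple = {k: v for k, v in row.items() if not isinstance(v, list)}
        # Insert the parent's primary key into the subordinate relation.
        flat_rows.append({**dict(inherited), **simple})
        # Expanded key = key copied down from the parent + this relation's own key.
        expanded = tuple(inherited) + tuple((k, row[k]) for k in keys[name])
        for child_name, child_rows in row.items():
            if isinstance(child_rows, list):
                normalize(child_name, child_rows, keys, expanded, out)
    return out


KEYS = {
    "employee": ["man_no"],
    "job_history": ["job_date"],
    "salary_history": ["salary_date"],
    "children": ["child_name"],
}

employee = [{
    "man_no": 1, "name": "Jones",
    "job_history": [
        {"job_date": 1965, "title": "clerk",
         "salary_history": [{"salary_date": 1965, "salary": 5000},
                            {"salary_date": 1966, "salary": 5500}]},
    ],
    "children": [{"child_name": "Ann", "birth_year": 1960}],
}]

result = normalize("employee", employee, KEYS)
for relation_name, relation_rows in result.items():
    print(relation_name, relation_rows)
```

The recursion relies on the first condition stated below for the procedure: the nonsimple domains must form a tree, otherwise a subordinate relation would have no single parent key to inherit.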
如果要应用上述规范化,则非规范化关系集合必须满足以下条件:
If normalization as described above is to be applicable, the unnormalized collection of relations must satisfy the following conditions:
1. 非简单域的相互关系图是树的集合。
1. The graph of interrelationships of the nonsimple domains is a collection of trees.
2. 没有一个主键有一个不简单的组成域。
2. No primary key has a component domain which is nonsimple.
作者知道没有任何应用程序需要放宽这些条件。进一步的标准化操作是可能的。这些在本文中不予讨论。
The writer knows of no application which would require any relaxation of these conditions. Further operations of a normalizing kind are possible. These are not discussed in this paper.
当所有关系都以规范形式转换时,数组表示的简单性变得可行,这不仅有利于存储目的,而且有利于使用广泛不同的数据表示的系统之间的批量数据通信。通信形式将是数组表示的适当压缩版本,并且具有以下优点:
The simplicity of the array representation which becomes feasible when all relations are cast in normal form is not only an advantage for storage purposes but also for communication of bulk data between systems which use widely different representations of the data. The communication form would be a suitably compressed version of the array representation and would have the following advantages:
1. 它将没有指针(地址值或位移值)。
1. It would be devoid of pointers (address-valued or displacement-valued).
2. 它将避免对哈希寻址方案的所有依赖。
2. It would avoid all dependence on hash addressing schemes.
3. 它将不包含索引或排序列表。
3. It would contain no indices or ordering lists.
如果用户的关系模型以正常形式建立,则数据库中的数据项的名称可以采用比其他情况更简单的形式。通用名称的形式如下
If the user’s relational model is set up in normal form, names of items of data in the data bank can take a simpler form than would otherwise be the case. A general name would take a form such as

R (g).r.d
其中R是关系名称;g是一代标识符(可选);r是角色名称(可选);d是域名。由于仅当给定关系的几代存在或预期存在时才需要g ,并且仅当关系R具有两个或多个名为d的域时才需要r,因此简单形式R.d通常就足够了。
where R is a relational name; g is a generation identifier (optional); r is a role name (optional); d is a domain name. Since g is needed only when several generations of a given relation exist, or are anticipated to exist, and r is needed only when the relation R has two or more domains named d, the simple form R.d will often be adequate.
让我们用R表示数据子语言,用H表示宿主语言。R允许声明关系及其域。关系的每个声明都标识该关系的主键。声明的关系被添加到系统目录中,供具有适当授权的用户社区的任何成员使用。H允许支持声明,这些声明可能不太永久地指示这些关系如何在存储中表示。R允许指定从数据库检索任何数据子集的规范。对此类检索请求的操作受到安全限制。
Let us denote the data sublanguage by R and the host language by H. R permits the declaration of relations and their domains. Each declaration of a relation identifies the primary key for that relation. Declared relations are added to the system catalog for use by any members of the user community who have appropriate authorization. H permits supporting declarations which indicate, perhaps less permanently, how these relations are represented in storage. R permits the specification for retrieval of any subset of data from the data bank. Action on such a retrieval request is subject to security constraints.
数据子语言的通用性在于其描述能力(而不是其计算能力)。在大型数据库中,即使我们假设(正如我们所做的那样)只有一组有限的函数子例程可供系统访问使用,每个数据子集都有大量可能的(且合理的)描述符合检索条件的数据。因此,可以在集合规范中使用的限定表达式类必须具有描述性应用谓词演算的格式良好的公式类的幂。众所周知,为了保留这种描述能力,没有必要表达(无论选择什么语法)所选谓词演算的每个公式。例如,仅那些 prenex 范式就足够了(Church,1956)。
The universality of the data sublanguage lies in its descriptive ability (not its computing ability). In a large data bank each subset of the data has a very large number of possible (and sensible) descriptions, even when we assume (as we do) that there is only a finite set of function subroutines to which the system has access for use in qualifying data for retrieval. Thus, the class of qualification expressions which can be used in a set specification must have the descriptive power of the class of well-formed formulas of an applied predicate calculus. It is well known that to preserve this descriptive power it is unnecessary to express (in whatever syntax is chosen) every formula of the selected predicate calculus. For example, just those in prenex normal form are adequate (Church, 1956).
检索语句的限定或其他部分可能需要算术函数。此类函数可以在H中定义并在R中调用。
Arithmetic functions may be needed in the qualification or other parts of retrieval statements. Such functions can be defined in H and invoked in R.
如此指定的集合可以仅出于查询目的而被获取,或者可以被保留以用于可能的改变。插入采取向声明的关系添加新元素的形式,而不考虑其机器表示中可能存在的任何顺序。对社区(相对于个人用户或子社区)有效的删除采取从声明的关系中删除元素的形式。如果在R中声明了指定关系之间的删除和更新依赖关系,则某些删除和更新可能会被其他删除和更新触发。
A set so specified may be fetched for query purposes only, or it may be held for possible changes. Insertions take the form of adding new elements to declared relations without regard to any ordering that may be present in their machine representation. Deletions which are effective for the community (as opposed to the individual user or sub-communities) take the form of removing elements from declared relations. Some deletions and updates may be triggered by others, if deletion and update dependencies between specified relations are declared in R.
对数据采用的视图对用于检索数据的语言的一个重要影响是数据元素和集合的命名。上一节已经讨论了这方面的一些方面。使用通常的网络视图,用户通常会因创造和使用比绝对必要的更多的关系名称而感到负担,因为名称与路径(或路径类型)而不是与关系相关联。
One important effect that the view adopted toward data has on the language used to retrieve it is in the naming of data elements and sets. Some aspects of this have been discussed in the previous section. With the usual network view, users will often be burdened with coining and using more relation names than are absolutely necessary, since names are associated with paths (or path types) rather than with relations.
一旦用户意识到存储了某种关系,他将期望能够使用其参数作为“已知”和其余参数作为“未知”的任意组合来利用它,因为信息(如珠穆朗玛峰)就在那里。这是一个系统特征(许多当前的信息系统中缺少),我们将其称为(逻辑上)关系的对称利用。当然,性能的对称性是不可预期的。
Once a user is aware that a certain relation is stored, he will expect to be able to exploit it using any combination of its arguments as “knowns” and the remaining arguments as “unknowns,” because the information (like Everest) is there. This is a system feature (missing from many current information systems) which we shall call (logically) symmetric exploitation of relations. Naturally, symmetry in performance is not to be expected.
为了支持单个二元关系的对称利用,需要两个有向路径。对于n度关系,要命名和控制的路径数量是n阶乘。
To support symmetric exploitation of a single binary relation, two directed paths are needed. For a relation of degree n, the number of paths to be named and controlled is n factorial.
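Logically symmetric exploitation can be illustrated with a single Python query function that accepts any combination of domains as knowns. The function name `exploit`, the `DOMAINS` tuple, and the data are hypothetical; the linear scan says nothing about performance, which is expected to be asymmetric.

```python
from math import factorial

DOMAINS = ("supplier", "part", "project", "quantity")
supply = {(1, 2, 5, 17), (1, 3, 5, 23), (2, 3, 7, 9)}  # made-up values


def exploit(relation, domains, **knowns):
    """Return the tuples that match `knowns`; every other domain is an unknown."""
    positions = {domains.index(name): value for name, value in knowns.items()}
    return {
        row for row in relation
        if all(row[i] == value for i, value in positions.items())
    }


print(exploit(supply, DOMAINS, part=3))
print(exploit(supply, DOMAINS, supplier=1, project=5))

# With named paths instead, a relation of degree n needs n! directed paths.
print(factorial(2), factorial(len(DOMAINS)))
```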
同样,如果采用关系视图,其中每个n元关系 ( n > 2) 必须由用户表示为仅涉及二元关系的嵌套表达式(例如,参见 Feldman 的 LEAP 系统 [Feldman 和 Rovner,1968]) ) 那么必须创建2 n − 1 个名称,而不是仅使用直接n元表示法创建n + 1 个名称,如第32.1.2节中所述。例如,图 32.6的 4 元关系供给,它需要n元表示法中的 5 个名称,将以以下形式表示
Again, if a relational view is adopted in which every n-ary relation (n > 2) has to be expressed by the user as a nested expression involving only binary relations (see Feldman’s LEAP System [Feldman and Rovner, 1968], for example) then 2n − 1 names have to be coined instead of only n + 1 with direct n-ary notation as described in §32.1.2. For example, the 4-ary relation supply of Figure 32.6, which entails 5 names in n-ary notation, would be represented in the form

P (supplier, Q (part, R (project, quantity)))
采用嵌套二进制表示法,因此使用 7 个名称。
in nested binary notation and, thus, employ 7 names.
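The name counts can be checked mechanically. The Python sketch below builds a right-nested binary expression over n domains and counts the names in it; the builder `nested_binary`, the counter `count_names`, and the relation letters P, Q, R are assumptions made for illustration.

```python
def nested_binary(name_pool, domains):
    """Build a right-nested binary expression such as ('P', 'a', ('Q', 'b', 'c'))."""
    names = iter(name_pool)

    def build(rest):
        if len(rest) == 2:
            return (next(names), rest[0], rest[1])
        return (next(names), rest[0], build(rest[1:]))

    return build(list(domains))


def count_names(expr):
    """Count every relation name and domain name appearing in the expression."""
    if isinstance(expr, str):
        return 1
    return sum(count_names(part) for part in expr)


expr = nested_binary("PQR", ["supplier", "part", "project", "quantity"])
print(expr)
print(count_names(expr))  # n - 1 relation names plus n domain names
```

Direct n-ary notation needs one relation name plus n domain names, that is n + 1, which is 5 for a relation of degree 4.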
这种表达的另一个缺点是它的不对称性。尽管这种不对称性并不禁止对称利用,但它确实使用户表达某些询问基础变得非常困难(例如,考虑通过Q和R查询与某些给定项目相关的那些零件和数量)。
A further disadvantage of this kind of expression is its asymmetry. Although this asymmetry does not prohibit symmetric exploitation, it certainly makes some bases of interrogation very awkward for the user to express (consider, for example, a query for those parts and quantities related to certain given projects via Q and R).
可表达集合是可以用数据语言中的表达式指定的关系的总集合。这些表达式是根据命名集中关系的简单名称构造的;世代、角色和领域的名称;逻辑连接词;谓词演算的量词;以及某些常量关系符号,例如=、>。命名集是可表达集的子集——通常是一个非常小的子集。
The expressible set is the total collection of relations that can be designated by expressions in the data language. Such expressions are constructed from simple names of relations in the named set; names of generations, roles and domains; logical connectives; the quantifiers of the predicate calculus; and certain constant relation symbols such as =, >. The named set is a subset of the expressible set—usually a very small subset.
由于命名集中的某些关系可能是该集中其他关系的与时间无关的组合,因此考虑将命名集与定义这些与时间无关的约束的语句集合相关联是有用的。我们将推迟对此的进一步讨论,直到我们引入了几种关系操作(参见第32.2节)。
Since some relations in the named set may be time-independent combinations of others in that set, it is useful to consider associating with the named set a collection of statements that define these time-independent constraints. We shall postpone further discussion of this until we have introduced several operations on relations (see §32.2).
要为其用户支持关系模型的数据系统的设计者面临的主要问题之一是确定要支持的存储表示的类别。理想情况下,允许的数据表示形式的多样性应足以覆盖整个安装集合的性能要求范围。太多的多样性会导致不必要的存储开销以及对当前有效结构的描述的不断重新解释。
One of the major problems confronting the designer of a data system which is to support a relational model for its users is that of determining the class of stored representations to be supported. Ideally, the variety of permitted data representations should be just adequate to cover the spectrum of performance requirements of the total collection of installations. Too great a variety leads to unnecessary overhead in storage and continual reinterpretation of descriptions for the structures currently in effect.
对于任何选定的存储表示类别,数据系统必须提供一种方法,将以关系模型的数据语言表达的用户请求转换为对当前存储表示的相应且有效的操作。对于高级数据语言来说,这提出了一个具有挑战性的设计问题。然而,这是一个必须解决的问题——随着越来越多的用户获得对大型数据库的并发访问,提供有效响应和吞吐量的责任从单个用户转移到数据系统。
For any selected class of stored representations the data system must provide a means of translating user requests expressed in the data language of the relational model into corresponding—and efficient—actions on the current stored representation. For a high level data language this presents a challenging design problem. Nevertheless, it is a problem which must be solved—as more users obtain concurrent access to a large data bank, responsibility for providing efficient response and throughput shifts from the individual user to the data system.
下面讨论的操作专门针对关系。引入这些操作是因为它们在从其他关系派生关系方面发挥着关键作用。它们的主要应用是在非推理信息系统(不提供逻辑推理服务的系统)中,尽管添加此类服务时它们的适用性不一定会被破坏。
The operations discussed below are specifically for relations. These operations are introduced because of their key role in deriving relations from other relations. Their principal application is in noninferential information systems—systems which do not provide logical inference services—although their applicability is not necessarily destroyed when such services are added.
大多数用户不会直接关心这些操作。然而,信息系统设计者和与数据库控制有关的人员应该完全熟悉它们。
Most users would not be directly concerned with these operations. Information systems designers and people concerned with data bank control should, however, be thoroughly familiar with them.
32.2.1.1 排列 二元关系具有两列的数组表示形式。交换这些列会产生相反的关系。更一般地,如果将排列应用于n元关系的列,则所得关系被称为给定关系的一个排列。例如,如果把保持列顺序不变的恒等排列也算在内,图 32.6中的关系供应共有 4! = 24 种排列。
32.2.1.1 Permutation A binary relation has an array representation with two columns. Interchanging these columns yields the converse relation. More generally, if a permutation is applied to the columns of an n-ary relation, the resulting relation is said to be a permutation of the given relation. There are, for example, 4! = 24 permutations of the relation supply in Figure 32.6, if we include the identity permutation which leaves the ordering of columns unchanged.
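A column permutation is easy to state in code. In the Python sketch below, the function name `permute`, the 0-based column positions, and the data are assumptions made for illustration.

```python
from itertools import permutations


def permute(relation, order):
    """Apply a column permutation, given as a tuple of 0-based column positions."""
    return {tuple(row[i] for i in order) for row in relation}


supply = {(1, 2, 5, 17), (1, 3, 5, 23), (2, 3, 7, 9)}  # made-up values

# Every ordering of the four columns, the identity included.
all_orders = list(permutations(range(4)))
print(len(all_orders))

# Interchanging the two columns of a binary relation gives its converse.
print(permute({(1, 2), (3, 4)}, (1, 0)))
```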
由于用户的关系模型由关系的集合(域无序关系)组成,因此排列与孤立考虑的这种模型无关。然而,它与模型的存储表示的考虑相关。在提供关系的对称利用的系统中,由存储的关系负责的查询集与由该关系的任何排列负责的查询集相同。尽管从逻辑上讲没有必要同时存储关系及其某些排列,但出于性能考虑,这样做是明智的。
Since the user’s relational model consists of a collection of relationships (domain-unordered relations), permutation is not relevant to such a model considered in isolation. It is, however, relevant to the consideration of stored representations of the model. In a system which provides symmetric exploitation of relations, the set of queries answerable by a stored relation is identical to the set answerable by any permutation of that relation. Although it is logically unnecessary to store both a relation and some permutation of it, performance considerations could make it advisable.
32.2.1.2 投影 假设现在我们选择关系的某些列(删除其他列),然后从结果数组中删除行中的任何重复项。最终的数组表示一个关系,该关系被称为给定关系的投影。
32.2.1.2 Projection Suppose now we select certain columns of a relation (striking out the others) and then remove from the resulting array any duplication in the rows. The final array represents a relation which is said to be a projection of the given relation.
选择运算符π用于获得任何所需的排列、投影或两个运算的组合。因此,如果L是索引列表L = i 1 , i 2 , … , i k并且R是n元关系 ( n ≥ k ),则π L ( R ) 是第j个的k元关系列是R ( j = 1, 2, … , k )的第i j列,只是删除了结果行中的重复项。考虑图 32.6的供应关系。这种关系的排列投影如图 32.10所示。请注意,在这种特殊情况下,投影的 n 元组数量少于从中导出它的关系的n元组数量。
A selection operator π is used to obtain any desired permutation, projection, or combination of the two operations. Thus, if L is a list of indices L = i1, i2, …, ik and R is an n-ary relation (n ≥ k), then πL(R) is the k-ary relation whose jth column is column ij of R (j = 1, 2, …, k) except that duplication in resulting rows is removed. Consider the relation supply of Figure 32.6. A permuted projection of this relation is exhibited in Figure 32.10. Note that, in this particular case, the projection has fewer n-tuples than the relation from which it is derived.
Figure 32.10: A permuted projection of the relation in Figure 32.6
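The operator π can be sketched in Python as a set comprehension over 1-based column indices. The function name `project` and the sample data are assumptions made for this sketch and should not be read as the contents of the figures.

```python
def project(relation, indices):
    """pi_L(R): keep the listed columns (1-based, in the order given).

    Building a set removes any duplicate rows that result.
    """
    return {tuple(row[i - 1] for i in indices) for row in relation}


# Hypothetical instance of supply(supplier, part, project, quantity).
supply = {
    (1, 2, 5, 17),
    (1, 3, 5, 23),
    (2, 3, 7, 9),
    (2, 7, 9, 4),
    (4, 1, 1, 12),
}

# A permuted projection: the project column first, then the supplier column.
result = project(supply, [3, 1])
print(sorted(result))
print(len(supply), len(result))
```

Two of the sample rows share the same supplier and project, so the projection has one row fewer than the relation it is derived from.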
32.2.1.3 连接 假设我们有两个二元关系,它们有一些共同的域。在什么情况下我们可以将这些关系组合起来形成保留给定关系中所有信息的三元关系?
32.2.1.3 Join Suppose we are given two binary relations, which have some domain in common. Under what circumstances can we combine these relations to form a ternary relation which preserves all of the information in the given relations?
图 32.11中的示例显示了两个关系R、S,它们可以连接而不会丢失信息,而图 32.12显示了R与S的连接。如果存在三元关系U使得π 12 ( U ) = R且π 23 ( U ) = S ,则二元关系R可与二元关系 S连接。任何此类三元关系称为R与S的连接。如果R、S是二元关系,使得π 2 ( R ) = π 1 ( S ),则R可与S连接。在这种情况下始终存在的一种连接是R与S的自然连接,定义为
The example in Figure 32.11 shows two relations R, S, which are joinable without loss of information, while Figure 32.12 shows a join of R with S. A binary relation R is joinable with a binary relation S if there exists a ternary relation U such that π12(U) = R and π23(U) = S. Any such ternary relation is called a join of R with S. If R, S are binary relations such that π2(R) = π1(S), then R is joinable with S. One join that always exists in such a case is the natural join of R with S defined by

R ∗ S = {(a, b, c) : R(a, b) ∧ S(b, c)}
图 32.11: 两个可连接的关系
Figure 32.11: Two joinable relations
Figure 32.12: The natural join of R with S (from Figure 32.11)
其中,如果( a, b ) 是R的成员,则R ( a, b ) 的值为true,对于S ( b, c ) 也类似。立即的是
where R(a, b) has the value true if (a, b) is a member of R and similarly for S(b, c). It is immediate that, for the natural join R ∗ S,

π12(R ∗ S) = R
和
and

π23(R ∗ S) = S.
请注意,图 32.12中所示的连接是图 32.11中R与S的自然连接。另一个连接如图 32.13所示。
Note that the join shown in Figure 32.12 is the natural join of R with S from Figure 32.11. Another join is shown in Figure 32.13.
Figure 32.13: Another join of R with S (from Figure 32.11)
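The definitions of join and natural join can be exercised with a short Python sketch. The relations `R` and `S` below hold made-up values chosen so that more than one join exists; they are not claimed to reproduce the figures, and the function names are assumptions.

```python
def project(relation, indices):
    """pi_L: keep the listed 1-based columns; the set drops duplicate rows."""
    return {tuple(row[i - 1] for i in indices) for row in relation}


def natural_join(r, s):
    """All (a, b, c) such that (a, b) is in r and (b, c) is in s."""
    return {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}


def is_join(u, r, s):
    """u is a join of r with s if pi_12(u) equals r and pi_23(u) equals s."""
    return project(u, [1, 2]) == r and project(u, [2, 3]) == s


R = {(1, 1), (2, 1), (2, 2)}  # (supplier, part), hypothetical
S = {(1, 1), (1, 2), (2, 1)}  # (part, project), hypothetical

natural = natural_join(R, S)
other = {(1, 1, 2), (2, 1, 1), (2, 2, 1)}  # a second ternary relation to test

print(sorted(natural))
print(is_join(natural, R, S), is_join(other, R, S))
```

Both ternary relations project back onto R and S, so both are joins; the second is a proper subset of the natural join.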
对这些关系的检查揭示了域部分(要进行连接的域)的一个元素(元素 1),该元素具有在R下拥有多个相关属性的属性 也在S下。正是这个元素产生了多个连接。连接域中的此类元素称为R与S连接的歧义点。
Inspection of these relations reveals an element (element 1) of the domain part (the domain on which the join is to be made) with the property that it possesses more than one relative under R and also under S. It is this element which gives rise to the plurality of joins. Such an element in the joining domain is called a point of ambiguity with respect to the joining of R with S.
如果π 21 ( R ) 或S是函数,则将R与S连接时不会出现任何歧义。在这种情况下,R与S的自然连接是R与S的唯一连接。请注意,重申“ R与S ”的限定是必要的,因为S可能可以与R(以及R与S)连接,并且这种连接将是完全单独的考虑因素。在图 32.11中,关系R、π 21 ( R ) 、S、π 21 ( S ) 都不是函数。……
If either π21(R) or S is a function, no point of ambiguity can occur in joining R with S. In such a case, the natural join of R with S is the only join of R with S. Note that the reiterated qualification “of R with S” is necessary, because S might be joinable with R (as well as R with S), and this join would be an entirely separate consideration. In Figure 32.11, none of the relations R, π21(R), S, π21(S) is a function. …
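Points of ambiguity, and the function condition that rules them out, can be computed directly. In the Python sketch below the function names and the data are hypothetical; `converse(R)` plays the role of π21(R).

```python
from collections import defaultdict


def is_function(relation):
    """True if no first component is paired with two different second components."""
    image = {}
    for a, b in relation:
        if image.setdefault(a, b) != b:
            return False
    return True


def converse(relation):
    """Interchange the two columns of a binary relation."""
    return {(b, a) for a, b in relation}


def points_of_ambiguity(r, s):
    """Joining-domain elements with more than one relative under r and also under s."""
    under_r, under_s = defaultdict(set), defaultdict(set)
    for a, b in r:
        under_r[b].add(a)
    for b, c in s:
        under_s[b].add(c)
    return {b for b in under_r if len(under_r[b]) > 1 and len(under_s[b]) > 1}


R = {(1, 1), (2, 1), (2, 2)}  # (supplier, part), hypothetical
S = {(1, 1), (1, 2), (2, 1)}  # (part, project), hypothetical
S_fn = {(1, 1), (2, 1)}       # every part maps to exactly one project

print(points_of_ambiguity(R, S))
print(is_function(S_fn), points_of_ambiguity(R, S_fn))
```

With `S_fn` in place of `S`, each part has a single relative under the second relation, so no point of ambiguity remains.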
在第32.2节中,定义了关系操作和两种类型的冗余,并将其应用于维护数据一致状态的问题。随着越来越多不同类型的数据被集成到公共数据库中,这必将成为一个严重的实际问题。……
In §32.2 operations on relations and two types of redundancy are defined and applied to the problem of maintaining the data in a consistent state. This is bound to become a serious practical problem as more and more different types of data are integrated together into common data banks. …
经计算机协会许可,转载自 Codd (1970)。
Reprinted from Codd (1970), with permission from the Association for Computing Machinery.
温斯顿·罗伊斯(Winston Royce,1929-1995 年)是一名接受过训练的航空工程师和一名实践中的软件工程师。从 1961 年到 1994 年,他先是在 TRW,然后在洛克希德公司领导了航空航天工业的软件开发项目。与 Fred Brooks(第 40 章)一样,他利用自己开发大型系统的经验来制定有关如何改进流程的实用指南。本文总结了他在后来的著作中阐述的关键思想。
Winston Royce (1929–1995) was an aeronautical engineer by training and a software engineer in practice. From 1961 until 1994 he led software development projects in the aerospace industry, first at TRW and then at Lockheed. Like Fred Brooks (chapter 40), he drew on his experience developing large systems to develop practical guidance on how the process can be improved. This paper summarizes key ideas which he elaborated in later works.
罗伊斯的思想仍然具有影响力。图 33.1和33.2产生了软件开发术语“瀑布模型”,指的是一种协议,要求每个开发阶段完成后才能开始下一阶段,没有备份,后期发现的任何问题都归因于执行失败的早期阶段。然而,罗伊斯本人从未使用过“瀑布”一词,该论文清楚地描述了今天所谓的“迭代”或“敏捷”开发方法的元素,其中早期的实现是临时的,用于完善设计。
Royce’s ideas remain influential. Figures 33.1 and 33.2 gave rise to the term “waterfall model” for software development, referring to a protocol requiring each development stage to be completed before the next stage is begun, with no backing up and any problems discovered in later stages attributed to failed execution of earlier stages. However, Royce himself never uses the term “waterfall,” and the paper plainly describes elements of what would today be called “iterative” or “agile” development methods, in which early implementations are provisional and used in order to refine the design.
图 33.1: 为内部操作提供小型计算机程序的实施步骤。
Figure 33.1: Implementation steps to deliver a small computer program for internal operations.
图 33.2: 开发大型计算机程序以交付给客户的实施步骤。
Figure 33.2: Implementation steps to develop a large computer program for delivery to a customer.
Royce 不再强调编码,而是强调设计、分析和文档;这些想法与 Dijkstra 坚持在编写任何代码之前对程序进行逻辑和彻底思考的观点相呼应。罗伊斯没有迪杰斯特拉那样的数学僵化;作为负责交付用于关键航空航天系统的庞大系统的人,他不可能坚持将简单和优雅作为最高价值观。他对彻底测试的承诺不会给 Dijkstra 留下深刻的印象,他正确地观察到,测试只能揭示错误,而不能确定正确性。然而这篇论文充满了有用的智慧,它已经通过了时间的考验,至今仍然是很好的建议。
Royce de-emphasizes coding and stresses design, analysis, and documentation; these ideas are resonant with Dijkstra’s insistence on thinking logically and thoroughly about a program before writing any code. Royce has none of Dijkstra’s mathematical stiffness; as the person responsible for delivering enormous systems to be used in critical aeronautical and aerospace systems, he could not have insisted on simplicity and elegance as the highest values. His commitment to thorough testing would not have impressed Dijkstra, who observed, correctly, that testing can only reveal errors, not establish correctness. Yet this paper is so full of useful wisdom that it has passed the test of time and remains good advice today.
我将描述我对管理大型软件开发的个人看法。在过去的九年里,我承担了各种任务,主要涉及航天器任务规划、指挥和飞行后分析软件包的开发。在这些任务中,我在按时且在成本范围内达到运行状态方面取得了不同程度的成功。我因自己的经历而产生了偏见,我将在本次演讲中阐述其中一些偏见。
I am going to describe my personal views about managing large software developments. I have had various assignments during the past nine years, mostly concerned with the development of software packages for spacecraft mission planning, commanding and post-flight analysis. In these assignments I have experienced different degrees of success with respect to arriving at an operational state, on-time, and within costs. I have become prejudiced by my experiences and I am going to relate some of these prejudices in this presentation.
无论大小或复杂性如何,所有计算机程序开发都有两个共同的基本步骤。首先是分析步骤,然后是编码步骤,如图33.1所示。事实上,如果工作量足够小,并且最终产品由构建它的人来操作(就像通常使用内部使用的计算机程序所做的那样),那么这种非常简单的实现概念实际上就是所需要的。这也是大多数客户乐意支付的开发工作,因为这两个步骤都涉及真正的创造性工作,直接有助于最终产品的实用性。然而,制造更大软件系统的实施计划并且仅以这些步骤为关键,注定会失败。需要许多额外的开发步骤,没有一个步骤像分析和编码那样直接对最终产品做出贡献,并且所有步骤都会增加开发成本。客户人员通常不愿意支付费用,开发人员也不愿意实施它们。管理层的主要职能是将这些概念推销给两个团队,然后强制开发人员遵守。
There are two essential steps common to all computer program developments, regardless of size or complexity. There is first an analysis step, followed second by a coding step as depicted in Figure 33.1. This sort of very simple implementation concept is in fact all that is required if the effort is sufficiently small and if the final product is to be operated by those who built it—as is typically done with computer programs for internal use. It is also the kind of development effort for which most customers are happy to pay, since both steps involve genuinely creative work which directly contributes to the usefulness of the final product. An implementation plan to manufacture larger software systems, and keyed only to these steps, however, is doomed to failure. Many additional development steps are required, none contribute as directly to the final product as analysis and coding, and all drive up the development costs. Customer personnel typically would rather not pay for them, and development personnel would rather not implement them. The prime function of management is to sell these concepts to both groups and then enforce compliance on the part of development personnel.
图 33.2展示了一种更宏大的软件开发方法。分析和编码步骤仍在图中,但它们之前是两级需求分析,由程序设计步骤分隔开,然后是测试步骤。这些添加与分析和编码分开处理,因为它们的执行方式明显不同。必须对它们进行不同的规划和人员配备,以便最好地利用计划资源。
A more grandiose approach to software development is illustrated in Figure 33.2. The analysis and coding steps are still in the picture, but they are preceded by two levels of requirements analysis, are separated by a program design step, and followed by a testing step. These additions are treated separately from analysis and coding because they are distinctly different in the way they are executed. They must be planned and staffed differently for best utilization of program resources.
图 33.3描绘了该方案的连续开发阶段之间的迭代关系。步骤的排序基于以下概念:随着每个步骤的进展和设计的进一步详细化,前面和后面的步骤会进行迭代,但很少会出现序列中较远的步骤。所有这一切的优点是,随着设计的进行,变更过程的范围缩小到可管理的限度。需求分析完成后,在设计过程中的任何一点,都存在一个牢固且特写的移动基线,如果出现不可预见的设计困难,可以返回到该基线。我们拥有的是一个有效的后备位置,它倾向于最大限度地扩大早期工作的可挽救和保存范围。
Figure 33.3 portrays the iterative relationship between successive development phases for this scheme. The ordering of steps is based on the following concept: that as each step progresses and the design is further detailed, there is an iteration with the preceding and succeeding steps but rarely with the more remote steps in the sequence. The virtue of all of this is that as the design proceeds the change process is scoped down to manageable limits. At any point in the design process after the requirements analysis is completed there exists a firm and closeup moving baseline to which to return in the event of unforeseen design difficulties. What we have is an effective fallback position that tends to maximize the extent of early work that is salvageable and preserved.
图 33.3: 希望各个阶段之间的迭代交互仅限于连续的步骤。
Figure 33.3: Hopefully, the iterative interaction between the various phases is confined to successive steps.
我相信这个概念,但上述实施是有风险的,并且会导致失败。该问题如图 33.4所示。在开发周期结束时发生的测试阶段是第一个事件,在该阶段,时序、存储、输入/输出传输等的体验与分析有所不同。这些现象无法精确分析。例如,它们不是数学物理标准偏微分方程的解。然而,如果这些现象不能满足各种外部约束,那么必然会出现需要进行重大重新设计。简单的八进制补丁或重做一些孤立的代码并不能解决这些困难。所需的设计更改可能会具有很大的破坏性,以至于违反了设计所依据的软件要求以及为一切提供基本原理的软件要求。要么必须修改要求,要么需要对设计进行实质性更改。事实上,开发过程已经回到了原点,预计进度和/或成本将超支 100%。
I believe in this concept, but the implementation described above is risky and invites failure. The problem is illustrated in Figure 33.4. The testing phase which occurs at the end of the development cycle is the first event for which timing, storage, input/output transfers, etc., are experienced as distinguished from analyzed. These phenomena are not precisely analyzable. They are not the solutions to the standard partial differential equations of mathematical physics for instance. Yet if these phenomena fail to satisfy the various external constraints, then invariably a major redesign is required. A simple octal patch or redo of some isolated code will not fix these kinds of difficulties. The required design changes are likely to be so disruptive that the software requirements upon which the design is based and which provides the rationale for everything are violated. Either the requirements must be modified, or a substantial change in the design is required. In effect the development process has returned to the origin and one can expect up to a 100-percent overrun in schedule and/or costs.
图 33.4: 不幸的是,对于所示的过程,设计迭代从来不限于连续的步骤。
Figure 33.4: Unfortunately, for the process illustrated, the design iterations are never confined to the successive steps.
人们可能会注意到,分析和代码阶段被跳过了。当然,如果没有这些步骤,就无法生产软件,但通常这些阶段的管理相对容易,并且对需求、设计和测试影响很小。根据我的经验,整个部门都致力于轨道力学分析、航天器姿态确定、有效载荷活动的数学优化等,但是当这些部门完成其困难而复杂的工作时,最终的程序步骤只涉及几行串行算术代码。如果分析师在执行困难而复杂的工作时犯了错误,那么纠正总是通过代码中的微小更改来实现,而不会对其他开发基地产生破坏性的反馈。
One might note that there has been a skipping-over of the analysis and code phases. One cannot, of course, produce software without these steps, but generally these phases are managed with relative ease and have little impact on requirements, design, and testing. In my experience there are whole departments consumed with the analysis of orbit mechanics, spacecraft attitude determination, mathematical optimization of payload activity and so forth, but when these departments have completed their difficult and complex work, the resultant program steps involve a few lines of serial arithmetic code. If in the execution of their difficult and complex work the analysts have made a mistake, the correction is invariably implemented by a minor change in the code with no disruptive feedback into the other development bases.
然而,我相信所阐述的方法从根本上来说是合理的。本讨论的其余部分介绍了必须添加到此基本方法中的五个附加功能,以消除大部分开发风险。
However, I believe the illustrated approach to be fundamentally sound. The remainder of this discussion presents five additional features that must be added to this basic approach to eliminate most of the development risks.
修复的第一步如图 33.5所示。在软件需求生成阶段和分析阶段之间插入了初步程序设计阶段。这个过程可能会受到批评,因为程序设计者被迫在初始软件需求的相对真空中进行设计,而没有任何现有的分析。因此,如果他等到分析完成,他的初步设计与他的设计相比可能存在很大的错误。这种批评是正确的,但没有抓住重点。通过这种技术,程序设计者可以确保软件不会因为存储、定时和数据流量原因而失败。随着分析在后续阶段的进行,程序设计者必须对分析者施加存储、计时和操作约束,以便他能够感知结果。当他有理由需要更多此类资源来实现他的方程时,必须同时从他的分析师同胞那里夺取这些资源。通过这种方式,所有分析师和所有程序设计人员都将为有意义的设计过程做出贡献,最终实现执行时间和存储资源的正确分配。如果要应用的总资源不足,或者如果胚胎操作设计是错误的,那么将在早期阶段识别出来,并且可以在最终设计、编码和测试开始之前重做需求迭代和初步设计。这个程序是如何实施的?需要执行以下步骤。
The first step towards a fix is illustrated in Figure 33.5. A preliminary program design phase has been inserted between the software requirements generation phase and the analysis phase. This procedure can be criticized on the basis that the program designer is forced to design in the relative vacuum of initial software requirements without any existing analysis. As a result, his preliminary design may be substantially in error as compared to his design if he were to wait until the analysis was complete. This criticism is correct but it misses the point. By this technique the program designer assures that the software will not fail because of storage, timing, and data flux reasons. As the analysis proceeds in the succeeding phase the program designer must impose on the analyst the storage, timing, and operational constraints in such a way that he senses the consequences. When he justifiably requires more of this kind of resource in order to implement his equations it must be simultaneously snatched from his analyst compatriots. In this way all the analysts and all the program designers will contribute to a meaningful design process which will culminate in the proper allocation of execution time and storage resources. If the total resources to be applied are insufficient or if the embryo operational design is wrong it will be recognized at this earlier stage and the iteration with requirements and preliminary design can be redone before final design, coding and test commences. How is this procedure implemented? The following steps are required.
1. 与程序设计者一起开始设计过程,而不是分析师或程序员。
1. Begin the design process with program designers, not analysts or programmers.
2. 设计、定义和分配数据处理模式,即使冒着错误的风险。分配处理、功能,设计数据库,定义数据库处理,分配执行时间,定义与操作系统的接口和处理模式,描述输入和输出处理,并定义初步操作程序。
2. Design, define and allocate the data processing modes even at the risk of being wrong. Allocate processing, functions, design the data base, define data base processing, allocate execution time, define interfaces and processing modes with the operating system, describe input and output processing, and define preliminary operating procedures.
3. 撰写一份易于理解、信息丰富且最新的概述文档。每个工人都必须对系统有基本的了解。至少一个人必须对系统有深入的了解,这部分来自于必须编写一份概述文档。
3. Write an overview document that is understandable, informative and current. Each and every worker must have an elemental understanding of the system. At least one person must have a deep understanding of the system which comes partially from having had to write an overview document.
图 33.5: 步骤 1:确保在分析开始之前完成初步程序设计。
Figure 33.5: Step 1: Insure that a preliminary program design is complete before analysis begins.
此时应该提出“多少文档?”的问题。我自己的看法是“相当多”;如果留给他们自己的设备,这肯定比大多数程序员、分析师或程序设计者愿意做的更多。管理软件开发的第一条规则是严格执行文档要求。
At this point it is appropriate to raise the issue of—“how much documentation?” My own view is “quite a lot”; certainly more than most programmers, analysts, or program designers are willing to do if left to their own devices. The first rule of managing software development is ruthless enforcement of documentation requirements.
有时我会被要求审查其他软件设计工作的进度。我的第一步是调查文档的状态。如果文档严重默认,我的第一个建议很简单。取代项目管理。停止所有与文档无关的活动。使文档达到可接受的标准。如果没有高度的文档化,软件管理根本不可能实现。举个例子,让我提供以下估计进行比较。为了采购价值 500 万美元的硬件设备,我希望 30 页的规格能够提供足够的细节来控制采购。为了采购价值 500 万美元的软件,我估计 1500 页的规格就可以达到类似的控制效果。
Occasionally I am called upon to review the progress of other software design efforts. My first step is to investigate the state of the documentation. If the documentation is in serious default my first recommendation is simple. Replace project management. Stop all activities not related to documentation. Bring the documentation up to acceptable standards. Management of software is simply impossible without a very high degree of documentation. As an example, let me offer the following estimates for comparison. In order to procure a 5 million dollar hardware device, I would expect that a 30 page specification would provide adequate detail to control the procurement. In order to procure 5 million dollars of software I would estimate a 1500 page specification is about right in order to achieve comparable control.
为什么要这么多文档?
Why so much documentation?
1. 每个设计师必须与界面设计师、他的管理层以及可能的客户进行沟通。口头记录太无形,无法为界面或管理决策提供充分的基础。可接受的书面描述迫使设计师采取明确的立场并提供切实的完成证据。它可以防止设计师月复一月地躲在“我已经完成了 90%”综合症背后。
1. Each designer must communicate with interfacing designers, with his management and possibly with the customer. A verbal record is too intangible to provide an adequate basis for an interface or management decision. An acceptable written description forces the designer to take an unequivocal position and provide tangible evidence of completion. It prevents the designer from hiding behind the “I am 90-percent finished”-syndrome month after month.
2. 在软件开发的早期阶段,文档就是规范,就是设计。在编码开始之前,这三个名词(文档、规范、设计)表示同一个事物。如果文档不好,那么设计就不好。如果文档还不存在,那么还没有设计,只有人们思考和谈论设计,这是有一定价值的,但价值不大。
2. During the early phase of software development the documentation is the specification and is the design. Until coding begins these three nouns (documentation, specification, design) denote a single thing. If the documentation is bad the design is bad. If the documentation does not yet exist there is as yet no design, only people thinking and talking about the design which is of some value, but not much.
3. 良好文档的真正货币价值始于开发过程下游的测试阶段,并持续到运营和重新设计。文档的价值可以用每个项目经理面临的三种具体、有形的情况来描述。
3. The real monetary value of good documentation begins downstream in the development process during the testing phase and continues through operations and redesign. The value of documentation can be described in terms of three concrete, tangible situations that every program manager faces.
(a) 在测试阶段,通过良好的文档,经理可以将人员集中在程序中的错误上。如果没有良好的文档,每个错误,无论大小,都会由一个可能首先犯下错误的人进行分析,因为他是唯一了解程序领域的人。
(a) During the testing phase, with good documentation the manager can concentrate personnel on the mistakes in the program. Without good documentation every mistake, large or small, is analyzed by one man who probably made the mistake in the first place because he is the only man who understands the program area.
(b) 在运营阶段,有了良好的文档,管理者就可以使用面向运营的人员来运营该计划,并以更低的成本做得更好。如果没有良好的文档,软件就必须由构建它的人来操作。一般来说,这些人对运营相对不感兴趣,并且工作效率不如运营人员。在这方面应该指出的是,在操作情况下,如果出现某种挂起,总是首先归咎于软件。无论是要为软件开脱,还是要确定责任归属,软件文档都必须说得清楚。
(b) During the operational phase, with good documentation the manager can use operation-oriented personnel to operate the program and to do a better job, cheaper. Without good documentation the software must be operated by those who built it. Generally these people are relatively disinterested in operations and do not do as effective a job as operations-oriented personnel. It should be pointed out in this connection that in an operational situation, if there is some hangup the software is always blamed first. In order either to absolve the software or to fix the blame, the software documentation must speak clearly.
(c) 初次运行后,当系统改进有序时,良好的文档可以在现场进行有效的重新设计、更新和改造。如果不存在文档,通常整个现有的操作软件框架都必须被废弃,即使是相对适度的更改。
(c) Following initial operations, when system improvements are in order, good documentation permits effective redesign, updating, and retrofitting in the field. If documentation does not exist, generally the entire existing framework of operating software must be junked, even for relatively modest changes.
图 33.6显示了一个以前面所示步骤为关键的文档计划。请注意,生成了 6 个文档,并且在交付最终产品时,第 1 号、第 3 号、第 4 号、第 5 号和第 6 号文档已更新并处于最新状态。
Figure 33.6 shows a documentation plan which is keyed to the steps previously shown. Note that six documents are produced, and at the time of delivery of the final product, Documents No. 1, No. 3, No. 4, No. 5, and No. 6 are updated and current.
图 33.6: 步骤 2:确保文档是最新且完整的——至少需要六份独特不同的文档。
Figure 33.6: Step 2: Insure that documentation is current and complete—at least six uniquely different documents are required.
除了文档之外,成功的第二个最重要的标准是产品是否完全原创。如果相关计算机程序是第一次开发,则就关键设计/操作领域而言,应安排好最终交付给客户进行操作部署的版本实际上是第二个版本。图 33.7说明了如何通过模拟来实现这一点。请注意,这只是整个过程的缩影,其时间尺度相对于整体工作而言相对较小。这项工作的性质可能有很大差异,主要取决于总体时间尺度和要建模的关键问题领域的性质。如果该工作持续 30 个月,那么试点模型的早期开发可能会计划为 10 个月。对于这个时间表,可以利用相当正式的控制、文件程序等。然而,如果总体工作时间减少到 12 个月,那么试点工作可能会压缩到三个月,以便对主线开发获得足够的影响力。在这种情况下,相关人员需要具备非常特殊的广泛能力。他们必须对分析、编码和程序设计有直观的感觉。他们必须快速感知设计中的问题点,对它们进行建模,对它们的替代方案进行建模,忘记设计中不值得在早期研究的简单方面,并最终得出一个无错误的程序。无论哪种情况,与模拟一样,所有这一切的要点在于,计时、存储等问题现在可以精确研究,否则这些问题需要判断。如果没有这种模拟,项目经理就会受到人类判断的支配。通过模拟,他至少可以对一些关键假设进行实验测试,并缩小剩下的人类判断范围,即计算机程序设计领域(例如起飞总重、完成成本或每日双倍的估计)总是非常乐观。
After documentation, the second most important criterion for success revolves around whether the product is totally original. If the computer program in question is being developed for the first time, arrange matters so that the version finally delivered to the customer for operational deployment is actually the second version insofar as critical design/operations areas are concerned. Figure 33.7 illustrates how this might be carried out by means of a simulation. Note that it is simply the entire process done in miniature, to a time scale that is relatively small with respect to the overall effort. The nature of this effort can vary widely depending primarily on the overall time scale and the nature of the critical problem areas to be modeled. If the effort runs 30 months then this early development of a pilot model might be scheduled for 10 months. For this schedule, fairly formal controls, documentation procedures, etc., can be utilized. If, however, the overall effort were reduced to 12 months, then the pilot effort could be compressed to three months perhaps, in order to gain sufficient leverage on the mainline development. In this case a very special kind of broad competence is required on the part of the personnel involved. They must have an intuitive feel for analysis, coding, and program design. They must quickly sense the trouble spots in the design, model them, model their alternatives, forget the straightforward aspects of the design which aren’t worth studying at this early point, and finally arrive at an error-free program. In either case the point of all this, as with a simulation, is that questions of timing, storage, etc. which are otherwise matters of judgment, can now be studied with precision. Without this simulation the project manager is at the mercy of human judgment. With the simulation he can at least perform experimental tests of some key hypotheses and scope down what remains for human judgment, which in the area of computer program design (as in the estimation of takeoff gross weight, costs to complete, or the daily double) is invariably and seriously optimistic.
图 33.7: 第 3 步:尝试执行两次该工作 - 第一次结果提供了最终产品的早期模拟。
Figure 33.7: Step 3: Attempt to do the job twice—the first result provides an early simulation of the final product.
毫无疑问,项目资源的最大使用者,无论是人力、计算机时间还是管理判断,都是测试阶段。从资金和进度来看,这是最大风险的阶段。它发生在计划中备份替代品最不可用(如果有的话)的最后一点。
Without question the biggest user of project resources, whether it be manpower, computer time, or management judgment, is the test phase. It is the phase of greatest risk in terms of dollars and schedule. It occurs at the latest point in the schedule when backup alternatives are least available, if at all.
前面的三个建议,即在开始分析和编码之前设计程序、完整记录程序以及建立试点模型,都是为了在进入测试阶段之前发现和解决问题。然而,即使在做了这些事情之后,仍然有一个测试阶段,并且仍然有重要的事情要做。图 33.8列出了测试的一些附加方面。在计划测试时,我建议考虑以下事项。
The previous three recommendations to design the program before beginning analysis and coding, to document it completely, and to build a pilot model are all aimed at uncovering and solving problems before entering the test phase. However, even after doing these things there is still a test phase and there are still important things to be done. Figure 33.8 lists some additional aspects to testing. In planning for testing, I would suggest the following for consideration.
1. 测试过程的许多部分最好由测试专家来处理,他们不一定对原始设计有贡献。如果有人认为只有设计师才能执行彻底的测试,因为只有他了解他所建造的区域,那么这肯定是未能正确记录的迹象。有了良好的文档,使用软件产品保证方面的专家是可行的,根据我的判断,他们会比设计人员做得更好。
1. Many parts of the test process are best handled by test specialists who did not necessarily contribute to the original design. If it is argued that only the designer can perform a thorough test because only he understands the area he built, this is a sure sign of a failure to document properly. With good documentation it is feasible to use specialists in software product assurance who will, in my judgment, do a better job of testing than the designer.
2. 大多数错误具有明显的性质,可以通过目视检查轻松发现。每一位分析和每一位代码都应该由第二方进行简单的视觉扫描,该第二方没有进行原始分析或代码,但他们会发现诸如丢失的减号、缺少二的因数、跳转到错误地址之类的事情等,其中校对分析和代码的性质。不要用计算机来检测这种东西——它太贵了。
2. Most errors are of an obvious nature that can be easily spotted by visual inspection. Every bit of an analysis and every bit of code should be subjected to a simple visual scan by a second party who did not do the original analysis or code but who would spot things like dropped minus signs, missing factors of two, jumps to wrong addresses, etc., which are in the nature of proofreading the analysis and code. Do not use the computer to detect this kind of thing—it is too expensive.
3. 通过某种数值检查对计算机程序中的每个逻辑路径至少测试一次。如果我是客户,在该程序完成并经过认证之前,我不会接受交货。此步骤将发现大多数编码错误。
3. Test every logic path in the computer program at least once with some kind of numerical check. If I were a customer, I would not accept delivery until this procedure was completed and certified. This step will uncover the majority of coding errors.
虽然这个测试过程听起来很简单,但对于大型、复杂的计算机程序来说,用受控的输入值遍历每个逻辑路径相对困难。事实上,有些人会认为这几乎是不可能的。尽管如此,我仍坚持我的建议,即每条逻辑路径都至少接受一次真实检查。
While this test procedure sounds simple, for a large, complex computer program it is relatively difficult to plow through every logic path with controlled values of input. In fact there are those who will argue that it is very nearly impossible. In spite of this I would persist in my recommendation that every logic path be subjected to at least one authentic check.
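A toy illustration of recommendation 3, in modern terms, might look like the following sketch. Everything in it is hypothetical: the routine `classify`, its inputs, and the case table are invented for this example and do not come from the paper. The routine has two independent two-way decisions, hence four logic paths, and each path is driven once with a controlled input whose expected value was worked out by hand.

```python
from itertools import product


def classify(reading, limit):
    """Toy routine with two independent decisions, hence four logic paths."""
    path = []
    if reading < 0:
        reading = -reading
        path.append("negated")
    else:
        path.append("kept")
    if reading > limit:
        result = limit
        path.append("clipped")
    else:
        result = reading
        path.append("passed")
    return result, tuple(path)


# One controlled input per logic path, each with a hand-computed expected value.
CASES = [
    ((-5, 3), 3, ("negated", "clipped")),
    ((-2, 3), 2, ("negated", "passed")),
    ((7, 3), 3, ("kept", "clipped")),
    ((1, 3), 1, ("kept", "passed")),
]

ALL_PATHS = set(product(("negated", "kept"), ("clipped", "passed")))


def run_cases():
    """Run every case, check its numerical result, and return the paths exercised."""
    covered = set()
    for args, expected, expected_path in CASES:
        result, path = classify(*args)
        assert result == expected
        assert path == expected_path
        covered.add(path)
    return covered
```

Comparing `run_cases()` with `ALL_PATHS` confirms that every path was exercised at least once. The sketch also hints at why the procedure is hard for large programs: k independent two-way decisions in sequence give 2^k paths, so the case table grows very quickly.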
4. 当简单错误(占大多数,并且掩盖了大错误)被消除后,就可以将软件移交给测试区进行检查。在开发过程中的适当时间和在适当的人手中,计算机本身就是最好的结帐设备。关键管理决策包括:最后结账的时间和人员是谁?
4. After the simple errors (which are in the majority, and which obscure the big mistakes) are removed, then it is time to turn over the software to the test area for checkout purposes. At the proper time during the course of development and in the hands of the proper person the computer itself is the best device for checkout. Key management decisions are: when is the time and who is the person to do final checkout?
图 33.8: 步骤 4:计划、控制和监控计算机程序测试。
Figure 33.8: Step 4: Plan, control, and monitor computer program testing.
由于某种原因,即使在事先达成一致之后,软件设计将要做什么也受到广泛的解释。以正式的方式让客户参与进来非常重要,这样他才能在最终交付之前尽早做出承诺。让承包商在需求定义和操作之间自由支配会带来麻烦。图33.9表示三个需求定义之后的要点,客户的洞察力、判断力和承诺可以支持开发工作。
For some reason what a software design is going to do is subject to wide interpretation even after previous agreement. It is important to involve the customer in a formal way so that he has committed himself at earlier points before final delivery. To give the contractor free rein between requirement definition and operation is inviting trouble. Figure 33.9 indicates three points following requirements definition where the insight, judgment, and commitment of the customer can bolster the development effort.
图 33.9: 步骤 5:让客户参与——参与应该是正式的、深入的和持续的。
Figure 33.9: Step 5: Involve the customer—the involvement should be formal, in-depth, and continuing.
图 33.10总结了我认为将有风险的开发过程转变为提供所需产品的过程所必需的五个步骤。我要强调的是,每件物品都需要花费一些额外的钱。如果没有此处描述的五种复杂性的相对简单的过程能够成功运行,那么额外的钱当然就花得不值。然而,根据我的经验,更简单的方法从未适用于大型软件开发工作,并且恢复成本远远超过为列出的五步流程提供资金所需的成本。
Figure 33.10 summarizes the five steps that I feel necessary to transform a risky development process into one that will provide the desired product. I would emphasize that each item costs some additional sum of money. If the relatively simpler process without the five complexities described here would work successfully, then of course the additional money is not well spent. In my experience, however, the simpler method has never worked on large software development efforts and the costs to recover far exceeded those required to finance the five-step process listed.
图 33.10: 总结。
Figure 33.10: Summary.
经电气和电子工程师协会许可,转载自 Royce(1970、1987)。
Reprinted from Royce (1970, 1987), with permission from the Institute of Electrical and Electronics Engineers.
在 Kruskal 未经阐述地声称他的生成树算法是“实用的”(第 17 章)十年后,在讨论图中的最大匹配时,计算机科学家 Jack Edmonds 停下来区分了多项式时间算法和指数时间算法(我们现在这样称呼它们)。Edmonds (1965) 开头附近的一个“题外话”指出,“需要对‘高效算法’一词的使用进行解释。......出于实际目的,代数阶和指数阶之间的差异通常比有限和非有限之间的差异更重要。” 大约在同一时间,数学家艾伦·科巴姆 (Alan Cobham,1965) 制定了一个与机器模型无关的类,他称之为ℒ,这些函数可在时间上计算,“受所涉及数字长度的多项式限制”,以数字形式表示。
While discussing maximal matchings in graphs a decade after Kruskal’s unelaborated claim that his spanning tree algorithm was “practical” (chapter 17), the computer scientist Jack Edmonds paused to draw the distinction between polynomial- and exponential-time algorithms (as we now call them). A “digression” near the beginning of Edmonds (1965) states, “An explanation is due on the use of the words ‘efficient algorithm.’… For practical purposes the difference between algebraic and exponential order is often more crucial than the difference between finite and non-finite.” At about the same time, the mathematician Alan Cobham (1965) formulated a machine-model-independent class he called ℒ, the functions computable in time “bounded by polynomials in the lengths of the numbers involved” represented in digital form.
在这篇开创性的论文中,Stephen Cook(生于 1939 年)将 Edmonds 所说的具有“代数阶”解法的问题类定义为ℒ*,今天称为𝒫或 PTIME。也就是说,ℒ*或𝒫是科巴姆函数类ℒ的集合模拟。然后,库克提供了问题之间多项式时间可约性的关键定义,并证明非确定性图灵机在多项式时间内接受的所有问题都可以在多项式时间内约简为合取范式命题演算的可满足公式集。(该类的𝒩𝒫表示法在本文发表后不久就被采用,参见第 36 章和 Knuth (1974b)。此外,该定理是用同义反复性来表述的,它属于 co-𝒩𝒫,而不是像今天习惯的那样用属于𝒩𝒫的可满足性来表述。)
In this seminal paper, Stephen Cook (b. 1939) defines as ℒ* the class of problems that Edmonds describes as having “algebraic order” solutions, known today as 𝒫 or PTIME. That is, ℒ* or 𝒫 is the set analog of Cobham’s class ℒ of functions. Cook then provides the crucial definition of polynomial-time reducibility between problems, and proves that all problems accepted in polynomial time by a nondeterministic Turing machine are polynomial-time reducible to the set of satisfiable formulas of the propositional calculus in conjunctive normal form. (The 𝒩𝒫 notation for this class was adopted soon after publication of this paper—see chapter 36 and Knuth (1974b). Also, the theorem is stated in terms of tautologyhood, which is in co-𝒩𝒫, rather than—as would be customary today—satisfiability, which is in 𝒩𝒫.)
库克于 1966 年在哈佛大学获得数学博士学位,在逻辑学家王浩的指导下研究乘法和其他数学函数的复杂性。他加入伯克利数学系担任助理教授。1970 年,伯克利拒绝授予库克终身教职,这只能被认为是一个错误,他随后转到多伦多大学。他在一次重要的计算机科学理论会议上发表了这篇简短的论文,证明了有时间限制的非确定性计算可以用布尔公式简洁地描述,因此布尔可满足性的多项式时间算法将立即导致每个 𝒩𝒫 问题的多项式时间算法。证明很简单——绝妙的见解是用公式来描述计算,这与图灵(1936 年,此处第 60 页)遥相呼应。库克的结果引发了对 𝒫 是否等于 𝒩𝒫 这一问题的至今尚未完成的探索,以及复杂性理论中数百个其他研究线索。
Cook received his PhD in mathematics at Harvard in 1966, working under the direction of the logician Hao Wang on the complexity of multiplication and other mathematical functions. He joined the mathematics department at Berkeley as assistant professor. In what can only be considered a blunder, Cook was denied tenure at Berkeley in 1970 and moved to the University of Toronto. He presented this short paper at a major computer science theory conference, demonstrating that time-bounded nondeterministic computations could be described succinctly by boolean formulas, and therefore that a polynomial-time algorithm for boolean satisfiability would immediately result in polynomial-time algorithms for every 𝒩𝒫 problem. The proof is easy—the brilliant insight was to describe computations by formulas in a distant echo of Turing (1936, here page 60). Cook’s result set off the still unfinished search for an answer to the question of whether or not 𝒫 = 𝒩𝒫 and hundreds of other research threads in complexity theory.
𝒫 = 𝒩 𝒫问题还有一个更重要的历史方面。Leonid Levin(生于 1948 年)在苏联独立且相对孤立地工作,发现了同一类“通用搜索问题”(Levin,1973),后来被称为 𝒩 𝒫完全问题。库克和莱文的发现几乎是同时进行的,尽管莱文的发表被推迟了。𝒩 𝒫完全问题的存在现在被称为库克-莱文定理。莱文于1978年移居美国,现任波士顿大学教授。
There is one more important historical aspect of the 𝒫 = 𝒩𝒫 question. Working independently and in relative isolation in the Soviet Union, Leonid Levin (b. 1948) had identified the same class of “Universal Search Problems” (Levin, 1973) that came to be known as the 𝒩𝒫-complete problems. The discoveries by Cook and Levin were nearly simultaneous, though Levin’s publication was delayed. The existence of 𝒩𝒫-complete problems is now known as the Cook–Levin Theorem. Levin emigrated to the US in 1978 and is now a professor at Boston University.
本文表明,由多项式时间限制的非确定性图灵机解决的任何识别问题都可以“简化”为确定给定命题公式是否是同义反复的问题。粗略地说,这里的“简化”意味着如果有预言机可用于解决第二个问题,则可以在多项式时间内确定地解决第一个问题。从这个可简化的概念出发,定义了多项式的难度,并且证明了确定同义反复的问题与确定两个给定图中的第一个图是否与第二个图的子图同构的问题具有相同的多项式度。讨论了其他示例。介绍并讨论了一种测量谓词演算证明过程复杂度的方法。
It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing machine can be “reduced” to the problem of determining whether a given propositional formula is a tautology. Here “reduced” means, roughly speaking, that the first problem can be solved deterministically in polynomial time provided an oracle is available for solving the second. From this notion of reducible, polynomial degrees of difficulty are defined, and it is shown that the problem of determining tautologyhood has the same polynomial degree as the problem of determining whether the first of two given graphs is isomorphic to a subgraph of the second. Other examples are discussed. A method of measuring the complexity of proof procedures for the predicate calculus is introduced and discussed.
在本文中,一组字符串是指某个固定的、大的、有限的字母表Σ上的一组字符串。该字母表足够大,可以包含此处描述的所有集合的符号。所有图灵机都是确定性识别设备,除非明确说明相反。
Throughout this paper, a set of strings means a set of strings on some fixed, large, finite alphabet Σ. This alphabet is large enough to include symbols for all sets described here. All Turing machines are deterministic recognition devices, unless the contrary is explicitly stated.
让我们修正命题演算的形式,其中公式被写成Σ上的字符串。由于我们需要无限多个命题符号(原子),因此每个这样的符号将由Σ的一个成员组成,后跟一个二进制表示法的数字以区分该符号。因此,长度为n的公式只能具有大约n/log n个不同的函数和谓词符号。逻辑连接词是 ∧(与)、∨(或)和 ¬(非)。
Let us fix a formalism for the propositional calculus in which formulas are written as strings on Σ. Since we will require infinitely many proposition symbols (atoms), each such symbol will consist of a member of Σ followed by a number in binary notation to distinguish that symbol. Thus a formula of length n can only have about n/log n distinct function and predicate symbols. The logical connectives are ∧ (and), ∨ (or), and ¬ (not).
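The n/log n estimate can be checked with a short counting argument. The derivation below is a reconstruction rather than part of the paper; it assumes the k distinct atoms are numbered 1 through k, each written as one letter of Σ followed by its index in binary.

```latex
% Atom number i costs one letter plus the binary numeral of i,
% which has \lfloor \log_2 i \rfloor + 1 digits.
\[
  n \;\ge\; \sum_{i=1}^{k} \bigl( 2 + \lfloor \log_2 i \rfloor \bigr)
    \;\ge\; \sum_{i=1}^{k} \log_2 i
    \;=\; \log_2 k!
    \;=\; \Theta(k \log k).
\]
% Hence k \log k = O(n), and therefore
\[
  k \;=\; O\!\left( \frac{n}{\log n} \right).
\]
```

So a formula of length n has room for at most on the order of n/log n distinct symbols, because merely writing each one down once already uses about k log k characters.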
The set of tautologies (denoted by {tautologies}) is a certain recursive set of strings on this alphabet, and we are interested in the problem of finding a good lower bound on its possible recognition times. We provide no such lower bound here, but Theorem 1 will give evidence that {tautologies} is a difficult set to recognize, since many apparently difficult problems can be reduced to determining tautologyhood. By reduced we mean, roughly speaking, that if tautologyhood could be decided instantly (by an “oracle”) then these problems could be decided in polynomial time. In order to make this notion precise, we introduce query machines, which are like Turing machines with oracles in Kreider and Ritchie (1964).
A query machine is a multitape Turing machine with a distinguished tape called the query tape, and three distinguished states called the query state, yes state, and no state, respectively. If M is a query machine and T is a set of strings, then a T-computation of M is a computation of M in which initially M is in the initial state and has an input string w on its input tape, and each time M assumes the query state there is a string u on the query tape, and the next state M assumes is the yes state if u ∈ T and the no state if u ∉ T. We think of an “oracle,” which knows T, placing M in the yes state or no state.
Definition. A set S of strings is P-reducible (P for polynomial) to a set T of strings iff there is some query machine M and a polynomial Q(n) such that for each input string w, the T-computation of M with input w halts within Q(|w|) steps (|w| is the length of w) and ends in an accepting state iff w ∈ S.
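The definition can be made concrete with a toy reduction. The following is a minimal sketch, not Cook's query-machine formalism: the "query machine" is an ordinary Python function that receives the oracle as a callable, and the oracle for T = {tautologies} is a brute-force stand-in that takes exponential time. The tuple encoding of formulas and all function names are assumptions made for illustration. The set S reduced here is the set of satisfiable formulas: a formula is satisfiable iff its negation is not a tautology, so one query suffices.

```python
from itertools import product

# Assumed encoding: ("atom", name), ("not", f), ("and", f, g), ("or", f, g).

def atoms(f):
    """Collect the atom names occurring in formula f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, env):
    """Evaluate formula f under the truth assignment env (name -> bool)."""
    op = f[0]
    if op == "atom":
        return env[f[1]]
    if op == "not":
        return not evaluate(f[1], env)
    if op == "and":
        return evaluate(f[1], env) and evaluate(f[2], env)
    if op == "or":
        return evaluate(f[1], env) or evaluate(f[2], env)
    raise ValueError(f"unknown connective: {op}")

def tautology_oracle(f):
    """Stand-in oracle for T = {tautologies}; brute force over all assignments."""
    names = sorted(atoms(f))
    return all(evaluate(f, dict(zip(names, vals)))
               for vals in product([False, True], repeat=len(names)))

def accepts_satisfiable(f, oracle):
    """Toy query machine for S = {satisfiable formulas}.

    It writes the negation of f on the "query tape", asks the oracle once,
    and accepts iff the answer is no. Apart from the query, the work is
    linear in the length of f.
    """
    return not oracle(("not", f))
```

With p = ("atom", "p"), the call accepts_satisfiable(p, tautology_oracle) accepts, while the contradiction ("and", p, ("not", p)) is rejected. Only the steps taken outside the oracle count toward the polynomial bound Q(|w|).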
It is not hard to see that P-reducibility is a transitive relation. Thus the relation E on sets of strings, given by (S, T) ∈ E iff each of S and T is P-reducible to the other, is an equivalence relation. The equivalence class containing a set S will be denoted by deg(S) (the polynomial degree of difficulty of S).
Definition. We will denote deg({0}) by ℒ*, where 0 denotes the zero function.
Thus ℒ* is the class of sets recognizable in polynomial time. ℒ* was discussed in Cook (1971a, p. 5), and is the string analog of Cobham’s class ℒ of functions (Cobham, 1965).
We now define the following special sets of strings.
1. The subgraph problem is the problem: given two finite undirected graphs, determine whether the first is isomorphic to a subgraph of the second. A graph G can be represented by a string $\bar{G}$ on the alphabet {0, 1, *} by listing the successive rows of its adjacency matrix, separated by *s. We let {subgraph pairs} denote the set of strings $\bar{G}_1 ** \bar{G}_2$ such that G1 is isomorphic to a subgraph of G2.
2. The graph isomorphism problem will be represented by the set, denoted by {isomorphic graph pairs}, of all strings $\bar{G}_1 ** \bar{G}_2$ such that G1 is isomorphic to G2.
3. The set {primes} is the set of all binary notations for prime numbers.
4. The set {DNF tautologies} is the set of strings representing tautologies in disjunctive normal form.
5. The set D3 consists of those tautologies in disjunctive normal form in which each disjunct has at most three conjuncts (each of which is an atom or negation of an atom).
Theorem 1. If a set S of strings is accepted by some nondeterministic Turing machine within polynomial time, then S is P-reducible to {DNF tautologies}.
Corollary. Each of the sets in definitions 1–5 is P-reducible to {DNF tautologies}.
This is because each set, or its complement, is accepted in polynomial time by some nondeterministic Turing machine.
Proof of Theorem 1. Suppose a nondeterministic Turing machine M accepts a set S of strings within time Q(n), where Q(n) is a polynomial. Given an input w for M, we will construct a proposition formula A(w) in conjunctive normal form such that A(w) is satisfiable iff M accepts w. Thus ¬A(w) is easily put in disjunctive normal form (using De Morgan’s laws), and ¬A(w) is a tautology if and only if w ∉ S. Since the whole construction can be carried out in time bounded by a polynomial in |w| (the length of w), the theorem will be proved.
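The De Morgan step can be checked mechanically on small formulas. The sketch below assumes a representation that is not in the paper: a literal is a pair (atom name, polarity), a CNF formula is a list of clauses, and a DNF formula is a list of conjunctions. Negating a CNF formula then amounts to flipping every literal and reading the same nested list as a DNF formula. The brute-force checks are exponential and serve only to illustrate that A is satisfiable iff ¬A is not a tautology.

```python
from itertools import product

def negate_cnf(cnf):
    """De Morgan: not(AND of ORs) equals OR of ANDs of the negated literals."""
    return [[(name, not polarity) for name, polarity in clause] for clause in cnf]

def cnf_true(cnf, env):
    """True iff every clause has at least one literal satisfied by env."""
    return all(any(env[n] == p for n, p in clause) for clause in cnf)

def dnf_true(dnf, env):
    """True iff some conjunction has all of its literals satisfied by env."""
    return any(all(env[n] == p for n, p in term) for term in dnf)

def assignments(names):
    """Yield every truth assignment to the given atom names."""
    for vals in product([False, True], repeat=len(names)):
        yield dict(zip(names, vals))

def is_satisfiable_cnf(cnf, names):
    return any(cnf_true(cnf, env) for env in assignments(names))

def is_tautology_dnf(dnf, names):
    return all(dnf_true(dnf, env) for env in assignments(names))
```

For example, the CNF formula (p ∨ q) ∧ ¬p is satisfiable, and its negation (¬p ∧ ¬q) ∨ p is not a tautology; the unsatisfiable p ∧ ¬p negates to the tautology ¬p ∨ p.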
We may as well assume the Turing machine M has only one tape, which is infinite to the right but has a left-most square. Let us number the squares from left to right 1, 2, …. Let us fix an input w to M of length n, and suppose w ∈ S. Then there is a computation of M with input w that ends in an accepting state within T = Q(n) steps. The formula A(w) will be built from many different proposition symbols, whose intended meanings, listed below, refer to such a computation.
Suppose the tape alphabet for M is {σ1, …, σℓ}, and the set of states is {q1, …, qr}. Notice that since the computation has at most T = Q(n) steps, no tape square beyond T is scanned.
Proposition symbols:
$P^i_{s,t}$ for 1 ≤ i ≤ ℓ, 1 ≤ s, t ≤ T. $P^i_{s,t}$ is true iff tape square number s at step t contains the symbol σi.
$Q^i_t$ for 1 ≤ i ≤ r, 1 ≤ t ≤ T. $Q^i_t$ is true iff at step t the machine is in state qi.
$S_{s,t}$ for 1 ≤ s, t ≤ T. $S_{s,t}$ is true iff at time t square number s is scanned by the tape head.
The formula A(w) is a conjunction B ∧ C ∧ D ∧ E ∧ F ∧ G ∧ H ∧ I formed as follows. Notice A(w) is in conjunctive normal form.
B will assert that at each step t, one and only one square is scanned. B is a conjunction B1 ∧ B2 ∧ ⋯ ∧ BT, where Bt asserts that at time t one and only one square is scanned:

$$B_t = (S_{1,t} \lor S_{2,t} \lor \cdots \lor S_{T,t}) \land \bigwedge_{1 \le i < j \le T} (\lnot S_{i,t} \lor \lnot S_{j,t})$$
For 1 ≤ s ≤ T and 1 ≤ t ≤ T, $C_{s,t}$ asserts that at square s and time t there is one and only one symbol. C is the conjunction of all the $C_{s,t}$.
D asserts that for each t there is one and only one state.
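The conjuncts B, C, and D all say "one and only one" of a family of propositions holds. A standard way to express this in conjunctive normal form is one clause demanding at least one, plus one two-literal clause per pair forbidding both. The sketch below shows that encoding for Bt; the string names for the scan propositions and the helper names are assumptions for illustration. The number of clauses is 1 + T(T − 1)/2, which stays polynomial in T.

```python
from itertools import combinations, product

def exactly_one(variables):
    """CNF clauses (lists of (name, polarity) literals) true iff exactly one variable is true."""
    at_least_one = [[(v, True) for v in variables]]
    at_most_one = [[(a, False), (b, False)] for a, b in combinations(variables, 2)]
    return at_least_one + at_most_one

def scan_clauses(t, T):
    """Clauses for B_t: at time t exactly one of the squares 1..T is scanned."""
    return exactly_one([f"S_{s},{t}" for s in range(1, T + 1)])

def satisfies(clauses, env):
    """True iff the assignment env (name -> bool) satisfies every clause."""
    return all(any(env[name] == polarity for name, polarity in clause)
               for clause in clauses)

def models(clauses, names):
    """Brute-force list of all satisfying assignments, as tuples of booleans."""
    return [vals for vals in product([False, True], repeat=len(names))
            if satisfies(clauses, dict(zip(names, vals)))]
```

The same exactly_one helper would serve for each C s,t (over the ℓ symbols) and for each time step of D (over the r states).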
E asserts the initial conditions are satisfied:

$$E = Q^0_1 \land S_{1,1} \land P^{i_1}_{1,1} \land P^{i_2}_{2,1} \land \cdots \land P^{i_n}_{n,1} \land P^1_{n+1,1} \land \cdots \land P^1_{T,1}$$

where w = σi1…σin, q0 is the initial state and σ1 is the blank symbol.
F, G, and H assert that for each time t the values of the P’s, Q’s and S’s are updated properly. For example, G is the conjunction over all t, i, j of $G^t_{i,j}$, where $G^t_{i,j}$ asserts that if at time t the machine is in state qi scanning symbol σj, then at time t + 1 the machine is in state qk, where qk is the state given by the transition function for M. [EDITOR: The version of the original paper posted on Cook’s website includes a handwritten note here: “(or states: nondeterminism).” Indeed, the construction as presented is correct only if the machine is deterministic; it needs to be done somewhat differently for nondeterministic machines.]
Finally, the formula I asserts that the machine reaches an accepting state at some time. The machine M should be modified so that it continues to compute in some trivial fashion after reaching an accepting state, so that A(w) will be satisfied.
It is now straightforward to verify that A(w) has all the properties asserted in the first paragraph of the proof.
Theorem 2. The following sets are P-reducible to each other in pairs (and hence each has the same polynomial degree of difficulty): {tautologies}, {DNF tautologies}, D3, {subgraph pairs}.
Remark. We have not been able to add either {primes} or {isomorphic graph pairs} to the above list. To show {tautologies} is P-reducible to {primes} would seem to require some deep results in number theory, while showing {tautologies} is P-reducible to {isomorphic graph pairs} would probably upset a conjecture of Corneil’s (Corneil and Gotlieb, 1970) from which he deduces that the graph isomorphism problem can be solved in polynomial time.
Incidentally, it is not hard to see from the Davis–Putnam procedure (Davis and Putnam, 1960) that the set D2 consisting of all DNF tautologies with at most two conjuncts per disjunct, is in ℒ*. Hence D2 cannot be added to the list in Theorem 2 (unless all sets in the list are in ℒ*).
Proof of Theorem 2. By the corollary to Theorem 1, each of the sets is P-reducible to {DNF tautologies}. Since obviously {DNF tautologies} is P-reducible to {tautologies}, it remains to show {DNF tautologies} is P-reducible to D3 and D3 is P-reducible to {subgraph pairs}.
To show {DNF tautologies} is P-reducible to D3, let A be a proposition formula in disjunctive normal form. Say A = B1 ∨ B2 ∨ ⋯ ∨ Bk, where B1 = R1 ∧ ⋯ ∧ Rs, and each Ri is an atom or negation of an atom, and s > 3. Then A is a tautology if and only if A′ is a tautology where

$$A' = (P \land R_3 \land \cdots \land R_s) \lor (\lnot P \land R_1 \land R_2) \lor B_2 \lor \cdots \lor B_k$$

where P is a new atom. Since we have reduced the number of conjuncts in B1, this process may be repeated until eventually a formula is found with at most three conjuncts per disjunct. Clearly the entire process is bounded in time by a polynomial in the length of A.
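The repeated splitting can be sketched in a few lines. The code assumes one particular splitting rule that preserves tautologyhood: a disjunct R1 ∧ ⋯ ∧ Rs with s > 3 is replaced by (P ∧ R3 ∧ ⋯ ∧ Rs) ∨ (¬P ∧ R1 ∧ R2) for a new atom P. Literals are (atom name, polarity) pairs, and the fresh atom names of the form "_P1", "_P2" are hypothetical and assumed not to clash with atoms already in the formula. The brute-force tautology test is exponential and is there only to check small cases.

```python
from itertools import count, product

def split_long_disjuncts(dnf):
    """Return an equi-tautologous DNF with at most three literals per disjunct."""
    fresh = (f"_P{i}" for i in count(1))  # hypothetical names for new atoms
    work = [list(term) for term in dnf]
    result = []
    while work:
        term = work.pop()
        if len(term) <= 3:
            result.append(term)
        else:
            p = next(fresh)
            # (P and R3 and ... and Rs) has s - 1 literals, so it may need another split.
            work.append([(p, True)] + term[2:])
            # (not P and R1 and R2) already has three literals.
            result.append([(p, False)] + term[:2])
    return result

def is_tautology(dnf):
    """Brute-force check that every assignment satisfies some disjunct."""
    names = sorted({name for term in dnf for name, _ in term})
    for vals in product([False, True], repeat=len(names)):
        env = dict(zip(names, vals))
        if not any(all(env[n] == pol for n, pol in term) for term in dnf):
            return False
    return True
```

Each split shortens the long disjunct by one literal and adds one three-literal disjunct, so a disjunct of length s needs s − 3 splits, which keeps the whole transformation polynomial in the length of the input.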
It remains to show that D3 is P-reducible to {subgraph pairs}. Suppose A is a formula in disjunctive normal form with three conjuncts per disjunct. Thus A = C1 ∨ ⋯ ∨ Ck, where Ci = Ri1 ∧ Ri2 ∧ Ri3, and each Rij is an atom or a negation of an atom. Now let G1 be the complete graph with vertices {v1, v2, …, vk}, and let G2 be the graph with vertices {uij}, 1 ≤ i ≤ k, 1 ≤ j ≤ 3, such that uij is connected by an edge to urs if and only if i ≠ r and the two literals (Rij, Rrs) do not form an opposite pair (that is, they are neither of the form (P, ¬P) nor of the form (¬P, P)). Thus there is a falsifying truth assignment to the formula A iff there is a graph homomorphism ϕ: G1 → G2 such that for each i, ϕ(vi) = uij for some j. (The homomorphism tells for each i which of Ri1, Ri2, Ri3 should be falsified, and the selective lack of edges in G2 guarantees that the resulting truth assignment is consistently specified.)
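The correspondence between falsifying assignments and homomorphisms can be tested on small formulas. The sketch below is an illustration under assumed names and data layout, not the paper's string encoding: literals are (atom name, polarity) pairs, a vertex u_ij is the pair (i, j), and the search for ϕ simply tries every way of picking one literal per disjunct and checks that the picks are pairwise adjacent in G2. Both searches are exponential; the point is only that they agree.

```python
from itertools import product

def build_g2(dnf):
    """Vertices (i, j) for literal j of disjunct i; edges join non-opposite literals of different disjuncts."""
    vertices = [(i, j) for i, term in enumerate(dnf) for j in range(len(term))]

    def opposite(x, y):
        return x[0] == y[0] and x[1] != y[1]

    edges = {
        frozenset((u, v))
        for u in vertices for v in vertices
        if u[0] != v[0] and not opposite(dnf[u[0]][u[1]], dnf[v[0]][v[1]])
    }
    return vertices, edges

def has_selecting_homomorphism(dnf):
    """Is there a map v_i -> u_ij (one j per i) sending the complete graph G1 into G2?"""
    _, edges = build_g2(dnf)
    k = len(dnf)
    for choice in product(*(range(len(term)) for term in dnf)):
        picked = [(i, choice[i]) for i in range(k)]
        if all(frozenset((picked[a], picked[b])) in edges
               for a in range(k) for b in range(a + 1, k)):
            return True
    return False

def is_falsifiable(dnf):
    """Brute-force search for an assignment making every disjunct false."""
    names = sorted({n for term in dnf for n, _ in term})
    for vals in product([False, True], repeat=len(names)):
        env = dict(zip(names, vals))
        if not any(all(env[n] == p for n, p in term) for term in dnf):
            return True
    return False
```

For p ∨ ¬p the only candidate pair of vertices is an opposite pair, so no homomorphism exists, matching the fact that the formula has no falsifying assignment.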
In order to guarantee that a one-one homomorphism ϕ: G1 → G2 has the property that for each i, ϕ(vi) = uij for some j, we modify G1 and G2 as follows. We select graphs H1, H2, …, Hk which are sufficiently distinct from each other that if G1′ is formed from G1 by attaching Hi to vi, 1 ≤ i ≤ k, and G2′ is formed from G2 by attaching Hi to each of ui1 and ui2 and ui3, 1 ≤ i ≤ k, then every one-one homomorphism ϕ: G1′ → G2′ has the property just stated. It is not hard to see such a construction can be carried out in polynomial time. Then G1′ can be embedded in G2′ if and only if A ∉ D3. This completes the proof of Theorem 2.
Theorem 1 and its corollary give strong evidence that it is not easy to determine whether a given proposition formula is a tautology, even if the formula is in normal disjunctive form. Theorems 1 and 2 together suggest that it is fruitless to search for a polynomial decision procedure for the subgraph problem, since success would bring polynomial decision procedures to many other apparently intractable problems. Of course the same remark applies to any combinatorial problem to which {tautologies} is P-reducible.
Furthermore, the theorems suggest that {tautologies} is a good candidate for an interesting set not in ℒ*, and I feel it is worth spending considerable effort trying to prove this conjecture. Such a proof would be a major breakthrough in complexity theory.
In view of the apparent complexity of {DNF tautologies}, it is interesting to examine the Davis–Putnam procedure (Davis and Putnam, 1960). This procedure was designed to determine whether a given formula in conjunctive normal form is satisfiable, but of course the “dual” procedure determines whether a given formula in disjunctive normal form is a tautology. I have not yet been able to find a series of examples showing the procedure (treated sympathetically to avoid certain pitfalls) must require more than polynomial time. Nor have I found an interesting upper bound for the time required.
If we let strings represent natural numbers (or k-tuples of natural numbers) using m-adic or other suitable notation, then the notions in the preceding sections can be made to apply to sets of numbers (or k-place relations on numbers). It is not hard to see that the set of relations accepted in polynomial time by some nondeterministic Turing machine is precisely the set ℒ+ of relations of the form

$$(\exists y \le g_k(\bar{x}))\, R(\bar{x}, y) \tag{34.1}$$

where $g_k(\bar{x}) = 2^{(\ell(\max \bar{x}))^k}$, ℓ(z) is the dyadic length of z, and $R(\bar{x}, y)$ is an ℒ* relation. (ℒ+ is the class of extended positive rudimentary relations of Bennett (1962).) If we remove the bound on the quantifier in formula (34.1), the class ℒ+ would become the class of recursively enumerable sets. Thus if ℒ+ is the analog of the class of r.e. sets, then determining tautologyhood is the analog of the halting problem; since, according to Theorem 1, {tautologies} has the complete ℒ+ degree just as the halting problem has the complete r.e. degree. Unfortunately, the diagonal argument which shows the halting problem is not recursive apparently cannot be adapted to show {tautologies} is not in ℒ*. …
Reprinted from Cook (1971b), with permission from the Association for Computing Machinery.
The World Wide Web has existed only since the mid-1990s, and search engines for less time than that. But the problem of locating documents by searching for keywords has existed as long as documents have existed. What is the best way to construct an index of the terms appearing in documents to make it easy to find a document using a set of interesting terms?
In this 1972 paper, Karen Spärck Jones (1935–2007) described a simple method that is still widely used at the heart of document retrieval systems. It has two components. The more often a term appears in a document, the more relevant it probably is to the document’s content. For example, a paper that uses the term “zebra” repeatedly is probably at least somewhat about zebras. But of course such a paper will also use the term “the” repeatedly, so mere frequency within a document is an unreliable indicator of a word’s significance.
However, the importance of high frequency of a term within a document must be tempered if that term also appears frequently in many other documents. The less common a term is in a collection of documents, the more likely it is of significance for the few documents in which it does appear. This counter-weighting factor is called the “inverse document frequency” or IDF. Most web search engines today use some form of IDF as a basis for retrieving web pages.
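The two components can be sketched together. The weighting below uses log(N / n), where N is the number of documents and n the number containing the term; this is one common modern formulation and is an assumption here, not necessarily the exact weighting of the 1972 paper. The function names and the three toy documents are likewise invented for illustration.

```python
import math
from collections import Counter

def idf_weights(documents):
    """Map each term to log(N / n): N documents in all, n of them containing the term."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}

def tf_idf(doc, weights):
    """Within-document count of each term, tempered by its collection-wide weight."""
    counts = Counter(doc)
    return {term: n * weights[term] for term, n in counts.items()}

# Toy collection (hypothetical): "the" is in every document, "zebra" in two of three.
docs = [
    ["the", "zebra", "the", "zebra", "the"],
    ["the", "lion"],
    ["the", "zebra", "habitat"],
]
weights = idf_weights(docs)
scores = tf_idf(docs[0], weights)
```

In the first toy document "the" occurs more often than "zebra", yet its weight is log(3/3) = 0, so its score vanishes while the score for "zebra" stays positive.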
Spärck Jones’s methods, carefully tuned using mathematical statistics, proved to be remarkably useful, even though they made no attempt to extract meaning from documents, to analyze sentence structure, or to cluster similar words together. She demonstrated, that is, that simple text statistics could be extremely powerful tools in the analysis of natural language.
Though she worked at laboratories and at the University of Cambridge from the 1950s until she retired in 2002, and became president of the Association for Computational Linguistics in 1994, Spärck Jones was awarded the title of professor only in 1999 (Bowles, 2019).
The exhaustivity of document descriptions and the specificity of index terms are usually regarded as independent. It is suggested that specificity should be interpreted statistically, as a function of term use rather than of term meaning. The effects on retrieval of variations in term specificity are examined, experiments with three test collections showing, in particular, that frequently-occurring terms are required for good overall performance. It is argued that terms should be weighted according to collection frequency, so that matches on less frequent, more specific, terms are of greater value than matches on frequent terms. Results for the test collections show that considerable improvements in performance are obtained with this very simple procedure.
We are familiar with the notions of exhaustivity and specificity: exhaustivity is a property of index descriptions, and specificity one of index terms. They are most clearly illustrated by a simple keyword or descriptor system. In this case the exhaustivity of a document description is the coverage of its various topics given by the terms assigned to it; and the specificity of an individual term is the level of detail at which a given concept is represented.
These features of a document retrieval system have been discussed by Cleverdon et al. (1966) and Lancaster (1968), for example, and the effects of variation in either have been noted. For instance, if the exhaustivity of a document description is increased by the assignment of more terms, when the number of terms in the indexing vocabulary is constant, the chance of the document matching a request is increased. The idea of an optimum level of indexing exhaustivity for a given document collection then follows: the average number of descriptors per document should be adjusted so that, hopefully, the chances of requests matching relevant documents are maximized, while too many false drops are avoided. Exhaustivity obviously applies to requests too, and one function of a search strategy is to vary request exhaustivity. I will be mainly concerned here, however, with document descriptions.
Specificity as characterized above is a semantic property of index terms: a term is more or less specific as its meaning is more or less detailed and precise. This is a natural view for anyone concerned with the construction of an entire indexing vocabulary. Some decision has to be made about the discriminating power of individual terms in addition to their descriptive propriety. For example, the index term “beverage” may be as properly used for documents about tea, coffee, and cocoa as the terms “tea”, “coffee”, and “cocoa”. Whether the more general term “beverage” only is incorporated in the vocabulary, or whether “tea”, “coffee”, and “cocoa” are adopted, depends on judgements about the retrieval utility of distinctions between documents made by the latter but not the former. It is also predicted that the more general term would be applied to more documents than the separate terms “tea”, “coffee”, and “cocoa”, so the less specific term would have a larger collection distribution than the more specific ones.
It is of course assumed here that such choices when a vocabulary is constructed are exclusive: we may either have “beverage” or “tea”, “coffee”, and “cocoa”. What happens if we have all four terms is a different matter. We may then either interpret “beverage” to mean “other beverages” or explicitly treat it as a related broader term. I will, however, disregard these alternatives here.
In setting up an index vocabulary the specificity of index terms is looked at from one point of view: we are concerned with the probable effects on document description, and hence retrieval, of choosing particular terms, or rather of adopting a certain set of terms. For our decisions will, in part, be influenced by relations between terms, and how the set of chosen terms will collectively characterize the set of documents. But throughout we assume some level of indexing exhaustivity. We are concerned with obtaining an effective vocabulary for a collection of documents of some broadly known subject matter and size, where a given level of indexing exhaustivity is believed to be sufficient to represent the content of individual documents adequately, and distinguish one document from another.
Index term specificity must, however, be looked at from another point of view. What happens when a given index vocabulary is actually used? We predict when we opt for “beverage”, for example, that it will be used more than “cocoa”. But we do not have much idea of how many documents there will be to which “beverage” may appropriately be assigned. This is not simply determined even when some level of exhaustivity is assumed. There will be some documents which cry out for “beverage” so to speak, and we may have some idea of what proportion of the collection this is likely to be. There will also be documents to which “beverage” cannot justifiably be assigned, and this proportion may also be estimated. But there is unfortunately liable to be some number of documents to which “beverage” may or may not be assigned, in either case quite plausibly. In general, therefore, the actual use of a descriptor may diverge considerably from the predicted use. The proportions of a collection to which a term does and does not belong can only be estimated very roughly; and there may be enough intermediate documents for the way the term is assigned to these to affect its overall distribution considerably. Over a long period the character of the collection as a whole may also change, with further effects on term distribution.
This is where the level of exhaustivity of description matters. As a collection grows maintaining a certain level of exhaustivity may mean that the descriptions of different documents are not sufficiently distinguished, while some terms are very heavily used. More generally, great variation in term distribution is likely to appear. It may thus be the case that a particular term becomes less effective as a means of retrieval, whatever its actual meaning. This is because it is not discriminating. It may be properly assigned to documents, in the sense that their content justifies the assignment; but it may no longer be sufficiently useful in itself as a device for distinguishing the typically small class of documents relevant to a request from the remainder of the collection. A frequently used term thus functions in retrieval as a nonspecific term, even though its meaning may be quite specific in the ordinary sense.
It is not enough, in other words, to think of index term specificity solely in setting up an index vocabulary, as having to do with accuracy of concept representation. We should think of specificity as a function of term use. It should be interpreted as a statistical rather than semantic property of index terms. In general we may expect vaguer terms to be used more often, but the behaviour of individual terms will be unpredictable. We can thus redefine exhaustivity and specificity for simple term systems: the exhaustivity of a document description is the number of terms it contains, and the specificity of a term is the number of documents to which it pertains. The relation between the two is then clear, and we can see, for instance, that a change in the exhaustivity of descriptions will affect term specificity: if descriptions are longer, terms will be used more often. This is inevitable for a controlled vocabulary, but also applies if extracted keywords are used, particularly in stem form. The incidence of words new to the keyword vocabulary does not simply parallel the number of documents indexed, and the extraction of more keywords per document is more likely to increase the frequency of current keywords than to generate new ones.
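Both redefined quantities are simple counts over the same term-document data, which the following sketch makes explicit. The document identifiers, the toy descriptions, and the function names are hypothetical; only the two definitions come from the text.

```python
from collections import Counter

def exhaustivity(descriptions):
    """Number of distinct terms in each document description."""
    return {doc_id: len(set(terms)) for doc_id, terms in descriptions.items()}

def postings(descriptions):
    """Number of documents to which each term pertains (the statistical reading of specificity)."""
    return Counter(term for terms in descriptions.values() for term in set(terms))

# Hypothetical descriptions echoing the beverage example.
descriptions = {
    "d1": ["beverage", "tea"],
    "d2": ["beverage", "coffee"],
    "d3": ["beverage", "cocoa", "tea"],
}
per_document = exhaustivity(descriptions)
per_term = postings(descriptions)
```

Here "beverage" pertains to all three documents and so behaves as the least specific term. Adding a term to any description raises that document's exhaustivity and, at the same time, raises that term's count, which is the coupling between the two notions described above.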
Once this statistical interpretation of specificity, and the relation between it and exhaustivity, are recognized, it is natural to attempt a more formal approach to seeking an optimum level of specificity in a vocabulary and an optimum level of exhaustivity in indexing, for a given collection. Within the broad limits imposed by having sensible terms, i.e. ones which can be reached from requests and applied to documents, we may try to set up a vocabulary with the statistical properties which are hopefully optimal for retrieval. Purely formal calculations may suggest the correct number of terms, and of terms per document, for a certain degree of document discrimination. Work on these lines has been done by Zunde and Slamecka (1967), for instance. More informally, the suggestion that descriptors should be designed to have approximately the same distribution, made by Salton (1968), for example, is motivated by respect for the retrieval effects of purely statistical features of term use.
Unfortunately, abstract calculations do not select actual terms. Nor are document collections static. More importantly, it is difficult to control requests. One may characterize documents with a view to distinguishing them nicely and then find that users do not provide requests utilizing these distinctions. We may therefore be forced to accept a de facto non-optimal situation with terms of varying specificity and at least some disagreeably non-specific terms. There will be some terms which, whatever the original intention, retrieve a large number of documents, of which only a small proportion can be expected to be relevant to a request. Such terms are on the whole more of a nuisance than rare, over-specific terms which fail to retrieve documents.
These features of term behaviour can be illustrated by examples from three well-known test collections, obtained from the Aslib Cranfield, INSPEC, and College of Librarianship Wales projects. In fact in these the vocabulary consists of extracted keyword stems, which may be expected to show more variation than controlled terms. But there is no reason to suppose that the situation is essentially different. Full descriptions of the collections are given in Cleverdon et al. (1966), Aitchison et al. (1970), and Keen and Digger (1972). Relevant characteristics of the collections are given in Section A of Figure 35.1. The INSPEC Collection, for instance, has 541 documents indexed by 1,341 terms. In all the collections, there are some very frequently occurring terms: for example in the Cranfield collection, one term occurs in 144 out of 200 documents; in the INSPEC one term occurs in 112 out of 541, and in the Keen collection one term occurs in 199 out of 797 documents. The terms concerned do not necessarily represent concepts central to the subject areas of the collections, and they are not always general terms. In the Keen collection, which is about information science, the most frequent term is “index-”, and other frequent ones include “librar-”, “inform-”, and “comput-”. In the INSPEC collection the most frequent is “theor-”, followed by “measur-” and “method-”. And in the Cranfield collection the most frequent is “flow-”, followed by “pressur-”, “distribut-” and “bound-” (boundary). The rarer terms are a fine mixed bag including “purchas-”, and “xerograph-” for Keen, “parallel-” and “silver-” for INSPEC, and “logarithm-” and “seri-” (series) for Cranfield.
当这些术语出现在请求中时,人们应该如何应对可变的术语特异性,尤其是不够具体的术语?原则上,频繁使用术语所带来的不良影响可以通过术语组合非常自然地解决。例如,尽管“bound-”、“layer-”和“flow-”这三个术语分别出现在 Cranfield 集合中的 73、62 和 144 个文档中,但所有这三个术语一起索引的文档只有 50 个。依赖术语连接非常简单。特别是,它是克服以下事实所带来的不良后果的一种方法:请求往往以更广为人知、因此通常更频繁的术语来提出。不幸的是,但并不奇怪的是,请求往往以平均频率远远高于整个索引词汇表的频率的方式呈现。这适用于所有三个测试集合,如图35.1的 B 部分所示。例如,对于克兰菲尔德馆藏,词汇表中术语的平均发布次数为 9,而请求中使用的术语的平均数量为 31.6;对于 Keen 来说,这个数字是 6.1 和 44.8。
How should one cope with variable term specificity, and especially with insufficiently specific terms, when these occur in requests? The untoward effects of frequent term use can in principle be dealt with very naturally, through term combinations. For instance, though the three terms “bound-”, “layer-”, and “flow-” occur in 73, 62, and 144 documents each in the Cranfield collection, there are only 50 documents indexed by all three terms together. Relying on term conjunction is quite straightforward. It is in particular a way of overcoming the untoward consequences of the fact that requests tend to be formulated in better known, and hence generally more frequent, terms. It is unfortunate, but not surprising, that requests tend to be presented in terms with an average frequency much above that for the indexing vocabulary as a whole. This holds for all three test collections, as appears in Section B of Figure 35.1. For the Cranfield collection, for example, the average number of postings for the terms in the vocabulary is nine, while the average for the terms used in the requests is 31.6; for Keen the figures are 6.1 and 44.8.
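The effect of term conjunction can be sketched in a few lines. This is only an illustration under stated assumptions: the inverted index below is a toy with made-up document numbers (not the Cranfield data), and the function name `conjunctive_match` is hypothetical.

```python
def conjunctive_match(index, terms):
    """Return the documents indexed by every one of the given terms."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    # Intersecting the posting sets keeps only documents containing all terms.
    return set.intersection(*postings)


# Toy inverted index (hypothetical): term stem -> set of document ids.
toy_index = {
    "bound": {1, 2, 3, 5},
    "layer": {2, 3, 4},
    "flow": {1, 2, 3, 4, 5, 6},
}

matches = conjunctive_match(toy_index, ["bound", "layer", "flow"])
```

Each term alone retrieves three to six of the toy documents, while the conjunction retrieves only the two documents that carry all three terms.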
但众所周知,依靠术语组合来减少误报是有风险的。确实,文档和请求之间的共同术语越多,该文档与请求相关的可能性就越大。不幸的是,匹配术语连词恰好很困难。三个集合的术语匹配行为很好地体现了这一点,如下所示如图 35.1的 C 部分所示。每个请求的平均起始术语数范围从基恩的 5.3 个到克兰菲尔德的 6.9 个。但每个请求的平均检索词数,即最高匹配分数的平均值,范围为 3.2 到 5.0。更重要的是,检索到的相关文档的匹配术语的平均数量从 Keen 的 1.8 到 Cranfield 的 3.6 不等,但幸运的是,所有检索到的文档(主要是不相关的)的平均匹配术语数量仅为 1.2 到 1.8。
But relying on term combination to reduce false drops is well-known to be risky. It is true that the more terms in common between a document and a request, the more likely it is that the document is relevant to the request. Unfortunately, it just happens to be difficult to match term conjunctions. This is well exhibited by the term-matching behaviour of the three collections, as shown in Section C of Figure 35.1. The average number of starting terms per request ranges from 5.3 for Keen to 6.9 for Cranfield. But the average number of retrieving terms per request, i.e. the average of the highest matching scores, ranges from 3.2 to 5.0. More importantly, the average number of matching terms for the relevant documents retrieved ranges from only 1.8 for Keen to 3.6 for Cranfield, though fortunately the average for all documents retrieved, which are predominantly non-relevant, ranges from a mere 1.2 to 1.8.
显然,解决这个问题的一种方法是以某种方式提供更多的匹配项。这可以通过分类为给定术语提供替代替代品来实现;或者通过增加文档或请求规范的详尽性,例如添加统计相关术语。但任何一种方法都需要付出努力,也许是相当大的努力,因为必须识别与各个术语相关的术语集。自然产生的问题是,是否可以在不涉及这种努力的情况下更好地使用现有的术语描述。
Clearly, one solution to this problem is to provide for more matching terms in some way. This may be achieved either by providing alternative substitutes for given terms, through a classification; or by increasing the exhaustivity of document or request specifications, say by adding statistically associated terms. But either approach involves effort, perhaps considerable effort, since the sets of terms related to individual terms must be identified. The question naturally arises as to whether better use of existing term descriptions can be made which does not involve such effort.
由于非常频繁出现的术语会导致检索中出现噪音,一种可能的做法是将它们从请求中删除。这将减少可用于联合匹配的术语数量,这一事实可能会被检索到的不相关文档减少这一事实所抵消。不幸的是,虽然频繁的术语会产生噪音,但它们也是相当高的召回率所必需的。对于所有三个测试集合,通过应用合适的阈值删除非常频繁的术语会导致整体性能下降。例如,对于 INSPEC 集合,阈值设置为删除 20 个或更多文档中出现的术语,从而删除了 1,341 个总词汇表中的 73 个术语。图 35.2的 Cranfield 集合的查全率/精确度图说明了检索性能的影响。匹配是通过简单的术语协调级别进行的,而对一组请求的平均是通过简单的数字平均值来进行的。然后对十个标准召回值的精度进行插值。其他集合也表现出完整术语匹配与仅与非频繁术语的限制匹配之间的相同关系:召回上限降低了至少 30%,事实上,对于 Keen 集合,召回上限从 75% 降低到 25%百分比,但精度保持不变。
As very frequently occurring terms are responsible for noise in retrieval, one possible course is simply to remove them from requests. The fact that this will reduce the number of terms available for conjoined matching may be offset by the fact that fewer non-relevant documents will be retrieved. Unfortunately, while frequent terms cause noise, they are also required for reasonably high recall. For all three test collections, the deletion of very frequent terms by the application of a suitable threshold leads to a decline in overall performance. For the INSPEC collection, for example, the threshold was set to delete terms occurring in 20 or more documents, so that 73 terms out of the total vocabulary of 1,341 were removed. The effect in retrieval performance is illustrated by the recall/precision graph of Figure 35.2 for the Cranfield collection. Matching is by simple term co-ordination levels, and averaging over the set of requests is by straightforward average of numbers. Precision at ten standard recall values is then interpolated. The same relationship between full term matching and this restricted matching with non-frequent terms only is exhibited by the other collections: the recall ceiling is lowered by at least 30 per cent, and indeed for the Keen collection is reduced from 75 per cent to 25 per cent, though precision is maintained.
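The thresholding step described above might look roughly as follows. This is a minimal sketch: the posting counts for "flow", "bound" and "layer" are the Cranfield figures quoted earlier, the count for "logarithm" is an assumed value chosen for illustration, and the threshold of 20 is borrowed from the INSPEC example.

```python
def drop_frequent_terms(request_terms, postings, threshold):
    """Keep only the request terms occurring in fewer than `threshold` documents."""
    return [t for t in request_terms if postings.get(t, 0) < threshold]


# Posting counts: the first three are quoted in the text, the last is assumed.
postings = {"flow": 144, "bound": 73, "layer": 62, "logarithm": 2}

request = ["flow", "bound", "layer", "logarithm"]
reduced = drop_frequent_terms(request, postings, threshold=20)
```

Here three of the four request terms are discarded, which shows why the recall ceiling falls: any relevant document that shares only the frequent terms with the request can no longer be retrieved.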
对请求的检查表明了为什么会得到这个结果。不仅请求词频率远高于平均收集频率;而且 相对较少的非常频繁的术语在请求的制定中起着很大的作用。例如,“Flow-”出现在 42 个克兰菲尔德请求中的 12 个中,一般来说,对于所有三个集合,请求中大约一半的术语是非常频繁的术语,如图35.1的 D 部分所示。扔掉非常频繁的术语就像把婴儿和洗澡水一起倒掉,因为检索许多相关文档都需要它们。非频繁项的组合具有区分性,但不会超过频繁项和非频繁项的组合。另一方面,当仅使用频繁项的匹配与完全匹配进行比较时,非频繁项的价值显而易见,如图35.2所示。总文档和相关文档的匹配级别几乎相同对于所有项来说都很高,但是后者中的非频繁项将相关匹配级别提高了大约1。
Inspection of the requests shows why this result is obtained. Not merely is request term frequency much above average collection frequency; the comparatively small number of very frequent terms plays a large part in request formulation. “Flow-” for example, appears in twelve Cranfield requests out of 42, and in general for all three collections about half the terms in a request are very frequent ones, as shown in Section D of Figure 35.1. Throwing very frequent terms away is throwing the baby out with the bath water, since they are required for the retrieval of many relevant documents. The combination of non-frequent terms is discriminating, but no more than that of frequent and non-frequent terms. The value of the non-frequent terms is clearly seen, on the other hand, when matching using frequent terms only is compared with full matching, also shown in Figure 35.2. Matching levels for total and relevant documents are nearly as high as for all terms, but the non-frequent terms in the latter raise the relevant matching level about 1.
术语检索的这些特征表明,为了提高初始完整术语的性能,我们需要利用非常频繁和非频繁术语的良好特征,同时最大限度地减少它们的不良特征。我们应该在频繁的学期比赛中允许一些优点,同时在不频繁的学期比赛中允许更多的优点。无论如何,我们希望最大化匹配项的数量。
These features of term retrieval suggest that to improve on the initial full term performance we need to exploit the good features of very frequent and non-frequent terms, while minimizing their bad ones. We should allow some merit in frequent term matches, while allowing rather more in non-frequent ones. In any case we wish to maximize the number of matching terms.
这清楚地表明了一种加权方案。在正常的术语协调匹配中,如果请求和文档具有共同的频繁术语,则该术语与非频繁术语一样重要;因此,如果请求和文档共享三个常用术语,则该文档将与与该请求共享三个稀有术语的另一个文档处于同一级别检索。但看来我们应该将非频繁项的匹配视为比频繁项的匹配更有价值,而不是完全忽视后者。自然的解决方案是将术语的匹配值与其收集频率相关联。在这个阶段,将术语分为频繁项和非频繁项是任意的,而且可能不是最佳的:优雅且几乎肯定更好的方法是将匹配值与相对频率更紧密地联系起来。词汇表的术语分布曲线建议了执行此操作的适当方法,该曲线具有熟悉的 Zipf 形状。令 f(n) = m,使得 $2^{m-1} < n \le 2^{m}$。那么,如果集合中有 N 个文档,则出现 n 次的术语的权重为 f(N) − f(n) + 1。例如,对于包含 200 个文档的 Cranfield 集合,这意味着出现 90 次的术语的权重为 2,而出现 3 次的术语的权重为 7。
This clearly suggests a weighting scheme. In normal term co-ordination matches, if a request and document have a frequent term in common, this counts for as much as a non-frequent one; so if a request and document share three common terms, the document is retrieved at the same level as another one sharing three rare terms with the request. But it seems we should treat matches on non-frequent terms as more valuable than ones on frequent terms, without disregarding the latter altogether. The natural solution is to correlate a term’s matching value with its collection frequency. At this stage the division of terms into frequent and non-frequent is arbitrary and probably not optimal: the elegant and almost certainly better approach is to relate matching value more closely to relative frequency. The appropriate way of doing this is suggested by the term distribution curve for the vocabulary, which has the familiar Zipf shape. Let f(n) = m such that $2^{m-1} < n \le 2^{m}$. Then where there are N documents in the collection, the weight of a term which occurs n times is f(N) − f(n) + 1. For the Cranfield collection with 200 documents, for example, this means that a term occurring ninety times has weight 2, while one occurring three times has weight 7.
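The weight function can be written down directly. The sketch below assumes that f(n) is the smallest integer m with n ≤ 2^m, which is how the definition above reads; the names `f` and `term_weight` are chosen here for illustration.

```python
def f(n: int) -> int:
    """Return the integer m such that 2**(m - 1) < n <= 2**m, for n >= 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # (n - 1).bit_length() is the smallest m with n <= 2**m.
    return (n - 1).bit_length()


def term_weight(n: int, N: int) -> int:
    """Weight of a term occurring in n documents of an N-document collection."""
    return f(N) - f(n) + 1


# The two Cranfield examples from the text (N = 200).
w_frequent = term_weight(90, 200)
w_rare = term_weight(3, 200)
```

With N = 200, f(N) = 8, so a term occurring ninety times gets 8 − 7 + 1 = 2 and a term occurring three times gets 8 − 2 + 1 = 7, in agreement with the figures in the text.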
因此,术语的匹配值与其特异性相关,并且文档的检索级别由其匹配术语的值的总和确定。简单的协调级别被更复杂的准排名所取代。这种效果可以通过两个文档分别匹配相同数量的相对频繁和相对不频繁的术语的请求的不同检索级别来说明。使用克兰菲尔德值范围,匹配频率为 15 和 43 的两个术语的文档将在级别 5 + 3 = 8 检索,而匹配频率为 3 和 7 的术语的文档将在级别 7 + 6 = 13 检索。显然,随着级别范围的“延伸”,更多的歧视是可能的。
The matching value of a term is thus correlated with its specificity and the retrieval level of a document is determined by the sum of the values of its matching terms. Simple co-ordination levels are replaced by a more sophisticated quasi-ranking. The effect can be illustrated by the different retrieval levels at which two documents are retrieved when they match a request on the same number of relatively frequent and relatively non-frequent terms respectively. With the Cranfield range of values, a document matching on two terms with frequencies 15 and 43 will be retrieved at level 5 + 3 = 8, while one matching on terms with frequencies 3 and 7 will be retrieved at level 7 + 6 = 13. Clearly, as the range of levels is “stretched”, more discrimination is possible.
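The quasi-ranking can be sketched as a sum of term weights over the matching terms. The term names `t15`, `t43`, `t3` and `t7` below are placeholders standing for terms with the collection frequencies used in the example above; they are not terms from the Cranfield vocabulary, and the function names are likewise illustrative.

```python
def f(n: int) -> int:
    """Return the integer m such that 2**(m - 1) < n <= 2**m, for n >= 1."""
    return (n - 1).bit_length()


def term_weight(n: int, N: int) -> int:
    """Weight f(N) - f(n) + 1 of a term occurring in n of N documents."""
    return f(N) - f(n) + 1


def retrieval_level(request_terms, doc_terms, postings, N):
    """Sum the weights of the terms shared by the request and the document."""
    matching = set(request_terms) & set(doc_terms)
    return sum(term_weight(postings[t], N) for t in matching)


N = 200  # size of the Cranfield collection
postings = {"t15": 15, "t43": 43, "t3": 3, "t7": 7}  # placeholder terms
request = ["t15", "t43", "t3", "t7"]

level_frequent = retrieval_level(request, ["t15", "t43"], postings, N)
level_rare = retrieval_level(request, ["t3", "t7"], postings, N)
```

Both documents match the request on two terms, so simple co-ordination would place them at the same level; the weighted scheme separates them, at levels 8 and 13 respectively.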
术语加权的想法并不新鲜。但它通常与术语相对于文档本身的假定重要性有关。例如,如果一篇文档主要是关于油漆的,并且只顺便提到了清漆,我们可以使用一些简单的权重标度来为术语“油漆”分配权重 2,为“清漆”分配权重 1。更非正式地,在提出请求时,我们可以声明在搜索过程中必须保留术语x ,但可以删除术语y 。如果有必要的信息,可以在统计基础上采用更系统的加权。如果文档(或摘要)中术语的实际出现频率已知,则可以使用它来生成权重。Artandi 和 Wolf (1969) 报道了使用频率从三点量表中选择权重,而 Salton 和 Lesk (1968) 则更全心全意地使用频率作为权重出现。在一系列实验中,索尔顿已经证明,与未加权项相比,以这种方式对项进行加权可以显着提高性能。
The idea of term weighting is not new. But it is typically related to the presumed importance of a term with respect to a document in itself. For instance, if a document is mainly about paint and only mentions varnish in passing, we may utilize some simple weighting scale to assign a weight of 2 to the term “paint” and 1 to “varnish”. More informally, in putting a request, we may state that during searching term x must be retained, but term y may be dropped. More systematic weighting on a statistical base may be adopted if the necessary information is available. If the actual frequency of occurrence of terms in a document (or abstract) is known, this may be used to generate weights. Artandi and Wolf (1969) report the use of frequency to select a weight from a three-point scale, while Salton and Lesk (1968) more wholeheartedly use the frequency of occurrence as a weight. In a range of experiments Salton has demonstrated that weighting terms in this way leads to a noticeable improvement in performance over that obtained for unweighted terms.
按收集频率进行加权与按文档频率进行加权有很大不同。它更强调术语作为区分一个文档与另一个文档的手段的价值,而不是其作为文档本身内容的指示的价值。两种加权形式之间的关系并不明显。在某些情况下,某个术语可能在文档中很常见,但在集合中很少见,因此它在两种方案中的权重都很大。但反之亦然。事实上,重点是术语的不同属性。
Weighting by collection frequency as opposed to document frequency is quite different. It places greater emphasis on the value of a term as a means of distinguishing one document from another than on its value as an indication of the content of the document itself. The relation between the two forms of weighting is not obvious. In some cases a term may be common in a document and rare in the collection, so that it would be heavily weighted in both schemes. But the reverse may also apply. It is really that the emphasis is on different properties of terms.
与术语匹配相关的术语收集频率的处理似乎尚未得到系统研究。Lesk 等人已经研究了术语频率对统计关联的影响,但这是另一回事。给定术语可能检索大量文档的事实可能会在设置搜索时被非正式地利用,特别是在 Borko (1968) 所描述的在线检索的上下文中。由于缺乏必要的信息,更全心全意的方法可能会受到阻碍。所描述的这种过程也比手动搜索更适合自动搜索。因此,有趣的是,术语频率已以 AD Little 实施的内部报告的操作交互式检索系统中指示的一般方式被利用(Curtice 和 Jones,1968)。在该系统中,索引关键字是从文本中自动提取的,因此权重与不断变化的词汇和集合相关联。但尚未见系统实验报道。
The treatment of term collection frequency in connection with term matching does not seem to have been systematically investigated. The effect of term frequency on statistical associations has been studied, for example by Lesk, but this is a different matter. The fact that a given term is likely to retrieve a large number of documents may be informally exploited in setting up searches, in particular in the context of on-line retrieval as described by Borko (1968), for example. More whole-hearted approaches are probably hampered by the lack of the necessary information. Such a procedure as the one described is also much more suited to automatic than manual searching. It is of interest, therefore, that term frequencies have been exploited in the general manner indicated within an operational interactive retrieval system for internal reports implemented at A. D. Little (Curtice and Jones, 1968). In this system indexing keywords are extracted automatically from text, and the weighting is therefore associated with a changing vocabulary and collection. However, no systematic experiments are reported.
所描述的术语权重系统在三个集合上进行了尝试。如前所述,它们在性质上有很大不同,具有不同大小的词汇量、文档描述和请求规范,如图35.1所示。然而,在所有情况下,与简单的术语匹配相比,使用术语权重进行匹配可以显着提高性能。以前面提到的形式呈现的结果如图35.3、35.4和35.5所示。基于曲线所包围面积差异的简单显着性检验表明,加权项给出的改进是完全显着的,差异远高于所需的最小值。
The term weighting system described was tried on the three collections. As noted, these are very different in character, with different sizes of vocabulary, document description, and request specification, as indicated in Figure 35.1. In all cases, however, matching with term weighting led to a substantial improvement in performance over simple term matching. The results, presented in the form mentioned earlier, are given in Figures 35.3, 35.4 and 35.5. A simple significance test based on the difference in area enclosed by the curves shows that the improvement given by weighted terms is fully significant, the difference being well above the required minimum.
这些结果之所以令人感兴趣有两个原因。所有三个集合都已用于不同索引语言、搜索技术等的一系列实验:参见 Cleverdon 等人。(1966);索尔顿(1968);索尔顿和莱斯克 (1968);斯帕克·琼斯 (1971);艾奇森等人。(1970);基恩与挖掘者 (1972)。尽管如此,这里获得的性能改进与通过任何其他方式(包括精心构建的同义词库)获得的简单未加权关键字词干匹配相比都有很好的改进:索尔顿的迭代搜索方法是不可比的。这些实验结果呈现方式的细节各不相同,因此不可能进行严格的比较:但总体情况是清晰的。事实上,就信息检索研究中任何可以称为可靠结果的事物而言,这就是其中之一。第二目前的结果的要点是,性能的提高是通过极其简单的方法获得的。它与最初简单的索引方法兼容,即使用提取的关键字,可以自动将其简化为词干;考虑到自动术语匹配程序,它很容易实现,因为所需的只是术语频率列表,并且很容易获得;它的优点是分配给术语的权重会随着集合的增长和变化而自然地调整。显然需要对比此处使用的集合大得多的集合进行实验;希望他们不会被拖延太久。
These results are of interest for two reasons. All three collections have been used for a whole range of experiments with different index languages, search techniques, and so on: see Cleverdon et al. (1966); Salton (1968); Salton and Lesk (1968); Spärck Jones (1971); Aitchison et al. (1970); Keen and Digger (1972). The performance improvement obtained here nevertheless represents as good an improvement over simple unweighted keyword stem matching as has been obtained by any other means, including carefully constructed thesauri: Salton’s iterative search methods are not comparable. The details of the way these experimental results are presented vary, so rigorous comparisons are impossible: but the general picture is clear. Indeed, insofar as anything can be called a solid result in information retrieval research, this is one. The second point about the present results is that the improvement in performance is obtained by extremely simple means. It is compatible with an initially plain method of indexing, namely the use of extracted keywords, which may be reduced to stems automatically; it is readily implemented given an automatic term-matching procedure, since all that is required is a term frequency list and this is easily obtained; and it has the merit that the weight assigned to terms is naturally adjusted to follow the growth of and changes in a collection. Experiments with very much larger collections than those used here are clearly desirable; they will hopefully not be long delayed.
经 Emerald Group Publishing Limited 许可,转载自 Spärck Jones (1972)。
Reprinted from Spärck Jones (1972), with permission from Emerald Group Publishing Limited.
理查德·“迪克”·卡普(Richard “Dick” Karp,生于 1935 年)出生于波士顿多切斯特附近。他的父亲是一名学校教师,后来成为学校校长。卡普喜欢拼图和游戏,并在早期接受了组合爆炸的实践课程,帮助他的父亲通过翻牌来安排学校的班级日历,寻找满足教室使用、教师可用性等各种限制的时间表。
Richard “Dick” Karp (b. 1935) was born in the Dorchester neighborhood of Boston. His father was a school teacher who rose to be a school principal. Karp liked puzzles and games, and received an early practical lesson in combinatorial explosion helping his father arrange the school’s class calendar by shuffling cards in search of a schedule that satisfied various constraints about use of classrooms, availability of teachers, and so on.
卡普在哈佛大学学习数学,发现离散数学特别有吸引力,因为它激发了他解谜的本能,也因为他被班上最好的纯数学学生吓到了,其中一些学生后来赢得了重大数学奖项。1955 年获得本科学位后,卡普加入了哈佛大学数字计算研究生项目,并于 1959 年获得博士学位。从那里他进入 IBM 研究中心,在那里他对各种计算机算法做出了贡献,特别是在组合优化方面。1968 年,他离开 IBM,成为伯克利大学的计算机科学教授,并在那里度过了他职业生涯的大部分时间。
Karp studied mathematics at Harvard, and found discrete math particularly attractive because it drew on his puzzle-solving instincts and also because he was intimidated by the best pure math students in his classes, some of whom went on to win major mathematical prizes. Upon receiving his undergraduate degree in 1955, Karp joined Harvard’s graduate program in digital computing, earning a PhD in 1959. From there he went to IBM Research, where he made contributions to a variety of computer algorithms, especially for combinatorial optimization. In 1968 he left IBM to become a computer science professor at Berkeley, where he has remained for most of his career.
这些传记细节为卡普对库克的反应奠定了基础(1971b,此处为第 34 章)。卡普认识到,许多组合问题与布尔可满足性具有家族相似性,因为它们的解决方案似乎需要在指数级大的搜索空间中搜索证明或证书,而如果找到的话,该证明或证书将相对较小且易于验证。
These biographical details set the stage for Karp’s reaction to Cook (1971b, here chapter 34). Karp recognized that a great many combinatorial problems had a family resemblance to boolean satisfiability, in that their solution seemed to require searching an exponentially large search space for a proof or certificate that would be relatively small, if found, and easily verified.
卡普在这里定义了𝒫、𝒩 𝒫、多项式时间可归约性(他的本科生导师 Hartley Rogers 研究过的可计算归约性版本)和𝒩 𝒫 - 完整性。然后他列出了 21 个问题,这些问题在𝒩 𝒫中都是不言而喻的,布尔可满足性可以直接或间接地还原到这些问题,从而表明它们都是𝒩 𝒫 -完备的。其中许多问题由来已久。卡普的论文表明,除了表面上的差异之外,它们都是同一个问题。很快,数百个其他问题被添加到卡普的列表中,并且𝒫 = 𝒩 𝒫问题成为现代数学中最重要的未解决问题之一。Garey 和 Johnson(1979)是计算机科学家用来查看新问题是否完整的纲要——在这种情况下,他们不再试图寻找有效的精确解决方案,而是转向寻找近似方法、更容易处理的子方法。问题的情况等
Karp here defines 𝒫, 𝒩𝒫, polynomial-time reducibility (a version of computable reducibility that had been studied by his undergraduate advisor Hartley Rogers), and 𝒩𝒫-completeness. And then he lists 21 problems, all self-evidently in 𝒩𝒫, to which boolean satisfiability is directly or indirectly reducible, thus showing them all to be 𝒩𝒫-complete. Many of these problems had long histories; Karp’s paper showed that they are all the same problem except for superficial variations. In short order hundreds of other problems were added to Karp’s list, and the 𝒫 = 𝒩𝒫 question became one of the most important unsolved problems of modern mathematics. Garey and Johnson (1979) is the compendium to which computer scientists turn to see if a new problem is 𝒩𝒫-complete—in which case they stop trying to find an efficient exact solution and move on to seek approximate methods, more tractable sub-cases of the problem, etc.
一大类计算问题涉及图、有向图、整数、整数数组、有限集的有限族、布尔公式和其他可数域的元素的属性的确定。通过将这些领域的简单编码转换为有限字母表上的单词集,这些问题可以转换为语言识别问题,并且我们可以探究它们的计算复杂性。当找到一种算法,该算法在由输入长度的多项式界定的多个步骤内终止时,可以合理地认为该问题已得到圆满解决。我们证明,覆盖、匹配、打包、路由、分配和排序等大量经典的未解决问题是等价的,从某种意义上说,它们要么都拥有多项式有界算法,要么都不拥有。
A large class of computational problems involve the determination of properties of graphs, digraphs, integers, arrays of integers, finite families of finite sets, boolean formulas and elements of other countable domains. Through simple encodings from such domains into the set of words over a finite alphabet these problems can be converted into language recognition problems, and we can inquire into their computational complexity. It is reasonable to consider such a problem satisfactorily solved when an algorithm for it is found which terminates within a number of steps bounded by a polynomial in the length of the input. We show that a large number of classic unsolved problems of covering, matching, packing, routing, assignment and sequencing are equivalent, in the sense that either each of them possesses a polynomial-bounded algorithm or none of them does.
目前已知的用于计算图的色数、确定图是否具有哈密顿回路或求解变量被限制为 0 或 1 的线性不等式系统的所有通用方法都需要组合搜索,其中最坏情况下的时间要求随着输入的长度呈指数增长。在本文中,我们给出的定理强烈表明,但并不意味着这些问题以及许多其他问题将永远难以解决。
All the general methods presently known for computing the chromatic number of a graph, deciding whether a graph has a Hamilton circuit, or solving a system of linear inequalities in which the variables are constrained to be 0 or 1, require a combinatorial search for which the worst case time requirement grows exponentially with the length of the input. In this paper we give theorems which strongly suggest, but do not imply, that these problems, as well as many others, will remain intractable perpetually.
我们特别感兴趣的是是否存在保证在输入长度多项式所限制的多个步骤中终止的算法。我们展示了一类众所周知的组合问题,包括上面提到的那些问题,它们是等价的,因为对于其中任何一个的多项式有界算法都将有效地为所有问题产生一个多项式有界算法。我们还表明,如果这些问题确实拥有多项式有界算法,那么出乎意料的广泛类别(粗略地说,可通过多项式深度回溯搜索解决的问题类别)中的所有问题都拥有多项式有界算法。
We are specifically interested in the existence of algorithms that are guaranteed to terminate in a number of steps bounded by a polynomial in the length of the input. We exhibit a class of well-known combinatorial problems, including those mentioned above, which are equivalent, in the sense that a polynomial-bounded algorithm for any one of them would effectively yield a polynomial-bounded algorithm for all. We also show that, if these problems do possess polynomial-bounded algorithms then all the problems in an unexpectedly wide class (roughly speaking, the class of problems solvable by polynomial-depth backtrack search) possess polynomial-bounded algorithms.
以下是对论文内容的简要概括。为了明确起见,我们的技术发展是根据单带图灵机识别语言来进行的,但是任何其他各种抽象计算模型都会产生相同的理论。设Σ *为所有有限 0 和 1 字符串的集合。Σ *的子集称为语言。令𝒫为单带确定性图灵机在多项式时间内可识别的语言类,并令𝒩 𝒫为单带非确定性图灵机在多项式时间内可识别的语言类。令Π为可通过单带图灵机在多项式时间内计算的从Σ *到Σ *的函数类。设L和M为语言。如果存在函数f ∈ Π使得f ( x ) ∈ M ⇔ x ∈ L ,我们说L ≼ M(L 可约简为 M ) 。如果M ∈ 𝒫且L ≼ M,则L ∈ 𝒫。如果L ≼ M且M ≼ L ,我们称L和M等价。如果L ∈ 𝒩 𝒫并且𝒩 𝒫中的每种语言都可简化为L ,则称L(多项式)完备。要么所有完整的语言都在𝒫中,要么都没有。前一种选择当且仅当𝒫 = 𝒩 𝒫时成立。
The following is a brief summary of the contents of the paper. For the sake of definiteness our technical development is carried out in terms of the recognition of languages by one-tape Turing machines, but any of a wide variety of other abstract models of computation would yield the same theory. Let Σ* be the set of all finite strings of 0s and 1s. A subset of Σ* is called a language. Let 𝒫 be the class of languages recognizable in polynomial time by one-tape deterministic Turing machines, and let 𝒩𝒫 be the class of languages recognizable in polynomial time by one-tape nondeterministic Turing machines. Let Π be the class of functions from Σ* into Σ* computable in polynomial time by one-tape Turing machines. Let L and M be languages. We say that L ≼ M (L is reducible to M) if there is a function f ∈ Π such that f(x) ∈ M ⇔ x ∈ L. If M ∈ 𝒫 and L ≼ M then L ∈ 𝒫. We call L and M equivalent if L ≼ M and M ≼ L. Call L (polynomial) complete if L ∈ 𝒩𝒫 and every language in 𝒩𝒫 is reducible to L. Either all complete languages are in 𝒫, or none of them are. The former alternative holds if and only if 𝒫 = 𝒩𝒫.
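The remark that M ∈ 𝒫 and L ≼ M together give L ∈ 𝒫 amounts to composing the reduction with a recognizer for M. The sketch below uses a deliberately trivial pair of languages, both chosen here for illustration: L is the set of binary strings with an even number of 1s, M is the one-word language {"0"}, and f maps a string to its parity bit.

```python
def decide_via_reduction(f, decide_M):
    """Build a recognizer for L from a reduction f (L to M) and a recognizer for M."""
    def decide_L(x: str) -> bool:
        # x is in L exactly when f(x) is in M.
        return decide_M(f(x))
    return decide_L


def parity_reduction(x: str) -> str:
    """Map x to "0" if it has an even number of 1s, else to "1"."""
    return str(x.count("1") % 2)


def decide_M(word: str) -> bool:
    """Recognizer for the toy language M = {"0"}."""
    return word == "0"


decide_L = decide_via_reduction(parity_reduction, decide_M)
```

If f and the recognizer for M both run in polynomial time, so does the composition, because the output of f has length polynomial in the length of x. The interesting cases are of course those where M is not known to be in 𝒫.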
本文的主要贡献是证明数学规划、图论、组合学、计算逻辑和切换理论等领域出现的大量经典困难计算问题在用自然语言表达时是完整的(因此是等价的)方式作为语言识别问题。
The main contribution of this paper is the demonstration that a large number of classic difficult computational problems, arising in fields such as mathematical programming, graph theory, combinatorics, computational logic and switching theory, are complete (and hence equivalent) when expressed in a natural way as language recognition problems.
本文受到 Stephen Cook (1971b) 工作的启发,并以他论文中出现的一个重要定理为基础。作者还要感谢尤金·劳勒 (Eugene Lawler) 和罗伯特·塔里安 (Robert Tarjan) 的重大贡献。
This paper was stimulated by the work of Stephen Cook (1971b), and rests on an important theorem which appears in his paper. The author also wishes to acknowledge the substantial contributions of Eugene Lawler and Robert Tarjan.
有一大类重要的计算问题涉及图、有向图、整数、有限集的有限族、布尔公式和其他可数域的元素的属性的确定。这是一个合理的工作假设,最初由 Jack Edmonds (1965) 在图论和整数规划问题中提出,现在已被广泛接受,即当且仅当存在一种算法其解决方案的运行时间受输入大小的多项式限制。……
There is a large class of important computational problems which involve the determination of properties of graphs, digraphs, integers, finite families of finite sets, boolean formulas and elements of other countable domains. It is a reasonable working hypothesis, championed originally by Jack Edmonds (1965) in connection with problems in graph theory and integer programming, and by now widely accepted, that such a problem can be regarded as tractable if and only if there is an algorithm for its solution whose running time is bounded by a polynomial in the size of the input. …
我们通过列出可在多项式时间内解决的问题样本来完成本节。在下一节中,我们将研究这些问题的一些近亲,这些问题在多项式时间内无法解决。附录 1 建立了我们的符号。每个问题都通过给出(在标题“INPUT ”下)其定义域的通用元素和(在标题“PROPERTY ”下)导致输入被接受的属性来指定。
We complete this section by listing a sampling of problems which are solvable in polynomial time. In the next section we examine a number of close relatives of these problems which are not known to be solvable in polynomial time. Appendix 1 establishes our notation. Each problem is specified by giving (under the heading “INPUT”) a generic element of its domain of definition and (under the heading “PROPERTY”) the property which causes an input to be accepted.
每个子句最多2个文字的可满足性(Cook,1971b )
SATISFIABILITY WITH AT MOST 2 LITERALS PER CLAUSE (Cook, 1971b)
INPUT:子句 C1、C2、…、Cp,每个子句最多包含 2 个文字
INPUT: Clauses C1, C2, …, Cp, each containing at most 2 literals
属性:给定子句的合取是可满足的;即,有一个集合
PROPERTY: The conjunction of the given clauses is satisfiable; i.e., there is a set S ⊆ {x1, x2, …, xn; x̄1, x̄2, …, x̄n}
这样
such that
a) S不包含互补的文字对并且
a) S does not contain a complementary pair of literals and
b) S ∩ C k ≠ ∅, k = 1, 2 , … , p。
b) S ∩ Ck ≠ ∅, k = 1, 2, …, p.
最小生成树(Kruskal,1956,此处第17章)
MINIMUM SPANNING TREE (Kruskal, 1956, here chapter 17)
输入:G、w、W
INPUT: G, w, W
性质:存在一棵权重≤W的生成树。
PROPERTY: There exists a spanning tree of weight ≤ W.
最短路径(Dijkstra,1959)
SHORTEST PATH (Dijkstra, 1959)
输入: G、 w 、W、s、t
INPUT: G, w, W, s, t
属性: s和t之间存在一条权重≤ W的路径。
PROPERTY: There is a path between s and t of weight ≤ W.
最小切割(Edmonds 和 Karp , 1972 )
MINIMUM CUT (Edmonds and Karp, 1972)
输入: G、 w 、W、s、t
INPUT: G, w, W, s, t
属性:存在一个权重 ≤ W 的 s, t 割。……
PROPERTY: There is an s, t cut of weight ≤ W. …
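As an illustration of why such problems count as solved, the MINIMUM SPANNING TREE recognition problem can be decided with Kruskal's greedy method. The sketch below assumes a particular input encoding that is not part of the text: nodes are numbered 0 to n − 1 and each edge is a triple (u, v, weight). The function name is likewise hypothetical.

```python
def has_spanning_tree_within(n, edges, W):
    """Decide whether the graph has a spanning tree of total weight <= W."""
    parent = list(range(n))  # union-find forest over the nodes

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    used = 0
    # Kruskal: scan edges by increasing weight, keep those joining two components.
    for weight, u, v in sorted((w, u, v) for u, v, w in edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            total += weight
            used += 1

    # A spanning tree exists only if n - 1 edges were accepted.
    return used == n - 1 and total <= W


toy_edges = [(0, 1, 1), (1, 2, 2), (2, 3, 3), (0, 3, 10), (0, 2, 4)]
```

The running time is dominated by sorting the edges, so the number of steps is bounded by a polynomial in the size of the input.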
在本节中,我们陈述了 Cook (1971b) 的一个重要定理,该定理断言某个宽类𝒩 𝒫中的任何语言都可以简化为特定集合S,这对应于确定合取范式形式的布尔公式是否为的问题可以满足的。
In this section we state an important theorem due to Cook (1971b) which asserts that any language in a certain wide class 𝒩𝒫 is reducible to a specific set S, which corresponds to the problem of deciding whether a boolean formula in conjunctive normal form is satisfiable.
令𝒫 (2)表示可在多项式时间内识别的Σ * × Σ *的子集类。给定L (2) ∈ 𝒫 (2)和多项式p,我们定义语言L如下:
Let 𝒫(2) denote the class of subsets of Σ* × Σ* which are recognizable in polynomial time. Given L(2) ∈ 𝒫(2) and a polynomial p, we define a language L as follows: L = {x | there exists y such that |y| ≤ p(|x|) and ⟨x, y⟩ ∈ L(2)}.
我们将L称为通过p有界存在量化从L (2)导出的语言。
We refer to L as the language derived from L(2) by p-bounded existential quantification.
定义 𝒩 𝒫是通过多项式有界存在量化从𝒫 (2)的元素导出的语言集合。
Definition 𝒩𝒫 is the set of languages derived from elements of 𝒫(2) by polynomial-bounded existential quantification.
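Read operationally, p-bounded existential quantification asks whether some string y of length at most p(|x|) makes the pair (x, y) acceptable. The sketch below decides this by exhaustive search, which takes exponentially many steps in general. The example relation is an assumption chosen for illustration: (x, y) is accepted when y is the binary numeral of a nontrivial divisor of the number written by x, so the derived language is the set of binary numerals of composite numbers.

```python
from itertools import product


def in_derived_language(x, in_L2, p):
    """Search every y with len(y) <= p(len(x)) for a witness that (x, y) is in L2."""
    bound = p(len(x))
    for length in range(bound + 1):
        for bits in product("01", repeat=length):
            if in_L2(x, "".join(bits)):
                return True
    return False


def divisor_relation(x: str, y: str) -> bool:
    """Polynomial-time check: y encodes a divisor d of n with 1 < d < n."""
    if not x or not y:
        return False
    n, d = int(x, 2), int(y, 2)
    return 1 < d < n and n % d == 0


def is_composite(x: str) -> bool:
    # Witnesses never need to be longer than x itself, so p(n) = n suffices.
    return in_derived_language(x, divisor_relation, lambda n: n)
```

Checking a single candidate y is fast; what makes the search expensive is the number of candidates, which is the situation the definition of 𝒩𝒫 is designed to capture.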
在非确定性图灵机方面,𝒩𝒫还有另一种表征。……
There is an alternative characterization of 𝒩𝒫 in terms of nondeterministic Turing machines. …
课程𝒩𝒫非常广泛。松散地讲,当且仅当可以通过多项式有界深度的回溯搜索来解决时,识别问题才属于𝒩 𝒫 。𝒫中不存在的大量重要计算问题显然都在𝒩 𝒫中。例如,考虑确定图G的节点是否可以用k 种颜色着色以便没有两个相邻节点具有该颜色的问题。非确定性算法可以简单地猜测节点的颜色分配,然后检查(在多项式时间内)所有相邻节点对是否具有不同的颜色。
The class 𝒩𝒫 is very extensive. Loosely, a recognition problem is in 𝒩𝒫 if and only if it can be solved by a backtrack search of polynomial bounded depth. A wide range of important computational problems which are not known to be in 𝒫 are obviously in 𝒩𝒫. For example, consider the problem of determining whether the nodes of a graph G can be colored with k colors so that no two adjacent nodes have the same color. A nondeterministic algorithm can simply guess an assignment of colors to the nodes and then check (in polynomial time) whether all pairs of adjacent nodes have distinct colors.
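The guess-and-check structure of the coloring example can be made concrete. In the sketch below the checking step runs in time proportional to the size of the graph, while the deterministic stand-in for "guessing" simply tries all assignments. The graph encoding (a node list plus a list of edges) and the function names are assumptions made for illustration.

```python
from itertools import product


def is_proper_coloring(edges, coloring, k):
    """Check a guessed coloring: colors in range(k), adjacent nodes differ."""
    for u, v in edges:
        if u not in coloring or v not in coloring:
            return False  # every endpoint must have been assigned a color
        if coloring[u] == coloring[v]:
            return False
    return all(color in range(k) for color in coloring.values())


def k_colorable(nodes, edges, k):
    """Deterministic substitute for guessing: try all k**len(nodes) assignments."""
    for colors in product(range(k), repeat=len(nodes)):
        if is_proper_coloring(edges, dict(zip(nodes, colors)), k):
            return True
    return False


triangle_nodes = [0, 1, 2]
triangle_edges = [(0, 1), (1, 2), (0, 2)]
```

The verifier is cheap, but the exhaustive search grows exponentially with the number of nodes; no method known at the time of the paper avoids that growth in the worst case.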
鉴于𝒩 𝒫的广泛范围,库克提出的以下定理是值得注意的。我们将可满足性问题定义如下:
In view of the wide extent of 𝒩𝒫, the following theorem due to Cook is remarkable. We define the satisfiability problem as follows:
可满足性
SATISFIABILITY
INPUT:子句 C1、C2、…、Cp
INPUT: Clauses C1, C2, …, Cp
属性:给定子句的合取是可满足的;即,有一个集合
PROPERTY: The conjunction of the given clauses is satisfiable; i.e., there is a set S ⊆ {x1, x2, …, xn; x̄1, x̄2, …, x̄n}
这样
such that
a) S不包含互补的文字对并且
a) S does not contain a complementary pair of literals and
b) S ∩ C k ≠ ∅, k = 1, 2 , … , p。
b) S ∩ Ck ≠ ∅, k = 1, 2, …, p.
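Conditions a) and b) translate directly into a polynomial-time check of a proposed set S. The sketch below assumes an encoding that is not in the text: a literal is a nonzero integer, with −i standing for the complement of variable i, and a clause is a collection of such integers.

```python
def satisfies(S, clauses):
    """Check conditions a) and b) for a proposed set S of literals."""
    S = set(S)
    # a) S does not contain a complementary pair of literals.
    no_complementary_pair = all(-literal not in S for literal in S)
    # b) S has a nonempty intersection with every clause.
    meets_every_clause = all(S & set(clause) for clause in clauses)
    return no_complementary_pair and meets_every_clause


# Toy instance: (x1 or x2) and (not x1 or x3) and (not x2 or not x3).
toy_clauses = [[1, 2], [-1, 3], [-2, -3]]
```

Verifying a given S is easy; the difficulty of SATISFIABILITY lies in finding such a set among the exponentially many candidates.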
定理 2(库克)。如果L ∈ 𝒩 𝒫则L ≼可满足性。
Theorem 2 (Cook). If L ∈ 𝒩𝒫 then L ≼ SATISFIABILITY.
Cook (1971b) 提出的定理使用了比这里使用的更弱的可还原性概念,但库克的证明支持了当前的陈述。
The theorem stated by Cook (1971b) uses a weaker notion of reducibility than the one used here, but Cook’s proof supports the present statement.
推论 1。𝒫 = 𝒩𝒫 ⇔ SATISFIABILITY ∈ 𝒫。
Corollary 1. 𝒫 = 𝒩𝒫 ⇔ SATISFIABILITY ∈ 𝒫.
证明。如果 SATISFIABILITY ∈ 𝒫,则对于每个 L ∈ 𝒩𝒫,都有 L ∈ 𝒫,因为 L ≼ SATISFIABILITY。如果 SATISFIABILITY ∉ 𝒫,那么由于显然 SATISFIABILITY ∈ 𝒩𝒫,故 𝒫 ≠ 𝒩𝒫。
Proof. If SATISFIABILITY ∈ 𝒫, then for each L ∈ 𝒩𝒫, L ∈ 𝒫, since L ≼ SATISFIABILITY. If SATISFIABILITY ∉ 𝒫, then since clearly SATISFIABILITY ∈ 𝒩𝒫, 𝒫 ≠ 𝒩𝒫.
评论。如果𝒫 = 𝒩 𝒫,则𝒩 𝒫在补集和多项式有界存在量化下是封闭的。因此,它在多项式有界通用量化下也是封闭的。由此可见,如果𝒫 = 𝒩 𝒫 ,则克莱恩算术层次结构(Rogers,1967)的多项式有界类似物变得微不足道。
Remark. If 𝒫 = 𝒩𝒫 then 𝒩𝒫 is closed under complementation and polynomial-bounded existential quantification. Hence it is also closed under polynomial-bounded universal quantification. It follows that a polynomial-bounded analogue of Kleene’s Arithmetic Hierarchy (Rogers, 1967) becomes trivial if 𝒫 = 𝒩𝒫.
定理 2 表明,如果存在多项式时间算法来判定 SATISFIABILITY 的成员资格,那么可由多项式深度回溯搜索解决的每个问题也可由多项式时间算法解决。这是 SATISFIABILITY ∉ 𝒫 的强有力的间接证据。
Theorem 2 shows that, if there were a polynomial-time algorithm to decide membership in SATISFIABILITY, then every problem solvable by a polynomial-depth backtrack search would also be solvable by a polynomial-time algorithm. This is strong circumstantial evidence that SATISFIABILITY ∉ 𝒫.
The main object of this paper is to establish that a large number of important computational problems can play the role of SATISFIABILITY in Cook’s theorem. Such problems will be called complete.
Definition 5. The language L is (polynomial) complete if
a) L ∈ 𝒩𝒫 and
b) SATISFIABILITY ≼ L.
Theorem 3. Either all complete languages are in 𝒫, or none of them are. The former alternative holds if and only if 𝒫 = 𝒩𝒫. …
The rest of the paper is mainly devoted to the proof of the following theorem.
Main theorem. All the problems on the following list are complete.
1. SATISFIABILITY
COMMENT: By duality, this problem is equivalent to determining whether a disjunctive normal form expression is a tautology.
2. 0-1 INTEGER PROGRAMMING
INPUT: integer matrix C and integer vector d
PROPERTY: There exists a 0-1 vector x such that Cx = d.
3. CLIQUE
INPUT: graph G, positive integer k
PROPERTY: G has a set of k mutually adjacent nodes.
4. SET PACKING
INPUT: Family of sets {Sj}, positive integer ℓ
PROPERTY: {Sj} contains ℓ mutually disjoint sets.
5. NODE COVER
INPUT: graph G′, positive integer ℓ
PROPERTY: There is a set R ⊆ N′ such that |R| ≤ ℓ and every arc is incident with some node in R.
6. SET COVERING
INPUT: finite family of finite sets {Sj}, positive integer k
PROPERTY: There is a subfamily {Th} ⊆ {Sj} containing ≤ k sets such that ⋃ Th = ⋃ Sj.
7. FEEDBACK NODE SET
INPUT: digraph H, positive integer k
PROPERTY: There is a set R ⊆ V such that every (directed) cycle of H contains a node in R.
8. FEEDBACK ARC SET
INPUT: digraph H, positive integer k
PROPERTY: There is a set S ⊆ E such that every (directed) cycle of H contains an arc in S.
9. DIRECTED HAMILTON CIRCUIT
INPUT: digraph H
PROPERTY: H has a directed cycle which includes each node exactly once.
10. UNDIRECTED HAMILTON CIRCUIT
INPUT: graph G
PROPERTY: G has a cycle which includes each node exactly once.
11. SATISFIABILITY WITH AT MOST 3 LITERALS PER CLAUSE
INPUT: Clauses D1, D2, …, Dr, each consisting of at most 3 literals over a given set of variables and their complements
PROPERTY: The set {D1, D2, …, Dr} is satisfiable.
12. CHROMATIC NUMBER
INPUT: graph G, positive integer k
PROPERTY: There is a function ϕ: N → Zk such that, if u and v are adjacent, then ϕ(u) ≠ ϕ(v).
13. CLIQUE COVER
INPUT: graph G′, positive integer ℓ
PROPERTY: N′ is the union of ℓ or fewer cliques.
14. EXACT COVER
INPUT: family {Sj} of subsets of a set {ui, i = 1, 2, …, t}
PROPERTY: There is a subfamily {Th} ⊆ {Sj} such that the sets Th are disjoint and ⋃ Th = ⋃ Sj = {ui, i = 1, 2, …, t}.
15. HITTING SET
INPUT: family {Ui} of subsets of {sj, j = 1, 2, …, r}
PROPERTY: There is a set W such that, for each i, |W ∩ Ui| = 1.
16. STEINER TREE
INPUT: graph G, R ⊆ N, weighting function w: A → Z, positive integer k
PROPERTY: G has a subtree of weight ≤ k containing the set of nodes in R.
17. 3-DIMENSIONAL MATCHING
INPUT: set U ⊆ T × T × T, where T is a finite set
PROPERTY: There is a set W ⊆ U such that |W| = |T| and no two elements of W agree in any coordinate.
18. KNAPSACK
INPUT: (a1, a2, …, an, b) ∈ Z^(n+1)
PROPERTY: ∑ aj xj = b has a 0-1 solution.
19. JOB SEQUENCING
INPUT: “execution time vector” (T1, …, Tp) ∈ Z^p, “deadline vector” (D1, …, Dp) ∈ Z^p, “penalty vector” (P1, …, Pp) ∈ Z^p, positive integer k
PROPERTY: There is a permutation π of {1, 2, …, p} such that (∑_{j=1}^{p} [if T_π(1) + ⋯ + T_π(j) > D_π(j) then P_π(j) else 0]) ≤ k.
20. PARTITION
INPUT: (c1, c2, …, cs) ∈ Z^s
PROPERTY: There is a set I ⊆ {1, 2, …, s} such that ∑_{h ∈ I} c_h = ∑_{h ∉ I} c_h.
21. MAX CUT
INPUT: graph G, weighting function w: A → Z, positive integer W
PROPERTY: There is a set S ⊆ N such that ∑_{{u, v} ∈ A, u ∈ S, v ∉ S} w({u, v}) ≥ W.
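Each problem on the list pairs an input with a property that asserts the existence of some object, and that object can be checked quickly once it is in hand. As a minimal sketch, here is KNAPSACK in that style. The function names are hypothetical, and the exhaustive search stands in for the nondeterministic "guess"; it takes time exponential in n and is not a practical algorithm.

```python
from itertools import product


def check_knapsack(a, b, x):
    """Polynomial-time check that x is a 0-1 solution of sum(a_j * x_j) = b."""
    return all(xj in (0, 1) for xj in x) and sum(aj * xj for aj, xj in zip(a, x)) == b


def knapsack_by_search(a, b):
    """Try all 2**n candidate vectors; return the first solution or None."""
    for x in product((0, 1), repeat=len(a)):
        if check_knapsack(a, b, x):
            return x
    return None
```

For instance, `knapsack_by_search([3, 5, 7], 10)` finds the vector `(1, 0, 1)`, whereas `knapsack_by_search([2, 4], 5)` finds none.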
It is clear that these problems (or, more precisely, their encodings into Σ*) are all in 𝒩𝒫. We proceed to give a series of explicit reductions, showing that SATISFIABILITY is reducible to each of the problems listed. Figure 36.1 shows the structure of the set of reductions. Each line in the figure indicates a reduction of the upper problem to the lower one.
Figure 36.1: Complete problems
To exhibit a reduction of a set T ⊆ D to a set T′ ⊆ D′ we specify a function F: D → D′ which satisfies the conditions of Lemma 2. In each case, the reader should have little difficulty in verifying that F does satisfy these conditions.
SATISFIABILITY ≼ 0-1 INTEGER PROGRAMMING
SATISFIABILITY ≼ CLIQUE
CLIQUE ≼ SET PACKING
Assume N = {1, 2, …, n}. The elements of the sets S1, S2, …, Sn are those two-element sets of nodes {i, j} not in A.
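The excerpt gives only the first sentence of this reduction. One natural reading, stated here as an assumption and not as Karp's exact text, is that S_i consists of the non-arcs {i, j} incident with node i and that the packing size ℓ equals k. Two sets S_i and S_j can then share only the element {i, j}, so they are disjoint exactly when i and j are adjacent. The sketch below builds the SET PACKING instance under that reading and compares both problems by brute force on small inputs; all function names are hypothetical.

```python
from itertools import combinations


def clique_to_set_packing(n, arcs, k):
    """Map a CLIQUE instance (nodes 1..n, arc list, k) to a SET PACKING instance."""
    A = {frozenset(arc) for arc in arcs}
    sets = {}
    for i in range(1, n + 1):
        # S_i holds every two-element set {i, j} that is NOT an arc of G.
        sets[i] = {frozenset((i, j)) for j in range(1, n + 1)
                   if j != i and frozenset((i, j)) not in A}
    return sets, k  # assumed: the packing size l equals k


def has_clique(n, arcs, k):
    """Brute force: is there a set of k mutually adjacent nodes?"""
    A = {frozenset(arc) for arc in arcs}
    return any(all(frozenset(pair) in A for pair in combinations(nodes, 2))
               for nodes in combinations(range(1, n + 1), k))


def has_packing(sets, l):
    """Brute force: does the family contain l mutually disjoint sets?"""
    return any(all(sets[i].isdisjoint(sets[j]) for i, j in combinations(chosen, 2))
               for chosen in combinations(sets, l))
```

On any small graph the two brute-force answers should agree for every k, which is what a correct reduction requires.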
We conclude by listing the following important problems in 𝒩𝒫 which are not known to be complete. [EDITOR: The complexity of GRAPH ISOMORPHISM remains unknown, though László Babai (2016) has made significant progress on this problem. But it is now known that NONPRIMES ∈ 𝒫 (Agrawal et al., 2004) and LINEAR INEQUALITIES ∈ 𝒫 (Khachiyan, 1979).]
GRAPH ISOMORPHISM
INPUT: graphs G and G′
PROPERTY: G is isomorphic to G′.
NONPRIMES
INPUT: positive integer k
PROPERTY: k is composite.
LINEAR INEQUALITIES
INPUT: integer matrix C, integer vector d
PROPERTY: Cx ≥ d has a rational solution.
…
Reprinted from Karp (1972), with permission from Springer.
The Compatible Time-Sharing System (chapter 23) was used only at MIT, but inspired a great deal of research and development on time-sharing systems. CTSS was succeeded by MULTICS, which MIT made some attempt to commercialize. One of the private-industry groups involved in developing MULTICS was at Bell Labs, which was building expertise in computer systems as the telephone system became automated. Ultimately Bell Labs withdrew from the MULTICS project, and a few of the researchers who had been involved in it decided to build a simpler, leaner operating system of their own, mostly to make it easier for them to write and debug code. Originally designed to run as a single-user system on a small computer (hence the name, a take-off on “MULTICS”), UNIX grew as the original developers, and their colleagues, found more uses for it at Bell Labs.
Bell Labs licensed UNIX for use outside the company, charging very low fees to universities. The effect was to make UNIX the de facto standard in computer science research departments. Generations of top technical talent graduated with knowledge of its structure and fondness for its modular, flexible, unencumbered design. The team working on UNIX developed the C programming language to implement it effectively (the original implementation was in assembly language), and this low-level higher-level language became a systems programming staple. UNIX’s hierarchical file structure (which UNIX inherited from MULTICS) has also been widely influential. Ultimately UNIX became the basis for the Berkeley Software Distribution (BSD), Richard Stallman’s GNU (“GNU’s not UNIX”), Linus Torvalds’s Linux, and Apple’s Mac OS.
Dennis Ritchie (1941–2011) was the son of a Bell Labs scientist. He studied physics at Harvard and computer science as a graduate student. He wrote and defended his PhD dissertation in 1968. But by then he was working at Bell Labs and failed to turn in the final version of the thesis, so he never received his PhD. Ritchie worked at Bell Labs for the remainder of his career. Ken Thompson (b. 1943), a Berkeley graduate, started at Bell Labs in 1966 and remained there until he retired in 2000, when he moved to Google. Ritchie and Thompson were recognized with the Turing Award in 1983—less than a decade after the publication of this description of UNIX.
UNIX is a general-purpose, multi-user, interactive operating system for the Digital Equipment Corporation (DEC) PDP-11/40 and 11/45 computers. It offers a number of features seldom found even in larger operating systems, including: (1) a hierarchical file system incorporating demountable volumes; (2) compatible file, device, and inter-process I/O; (3) the ability to initiate asynchronous processes; (4) system command language selectable on a per-user basis; and (5) over 100 subsystems including a dozen languages. This paper discusses the nature and implementation of the file system and of the user command interface.
There have been three versions of UNIX. The earliest version (circa 1969–70) ran on the Digital Equipment Corporation PDP-7 and -9 computers. The second version ran on the unprotected PDP-11/20 computer. [EDITOR: That is, the computer lacked memory protection mechanisms to support time-sharing.] This paper describes only the PDP-11/40 and /45 (DEC, 1972) system since it is more modern and many of the differences between it and older UNIX systems result from redesign of features found to be deficient or lacking.
Since PDP-11 UNIX became operational in February 1971, about 40 installations have been put into service; they are generally smaller than the system described here. Most of them are engaged in applications such as the preparation and formatting of patent applications and other textual material, the collection and processing of trouble data from various switching machines within the Bell System, and recording and checking telephone service orders. Our own installation is used mainly for research in operating systems, languages, computer networks, and other topics in computer science, and also for document preparation.
Perhaps the most important achievement of UNIX is to demonstrate that a powerful operating system for interactive use need not be expensive either in equipment or in human effort: UNIX can run on hardware costing as little as $40,000, and less than two man years were spent on the main system software. Yet UNIX contains a number of features seldom offered even in much larger systems. It is hoped, however, the users of UNIX will find that the most important characteristics of the system are its simplicity, elegance, and ease of use.…
…
The most important job of UNIX is to provide a file system. From the point of view of the user, there are three kinds of files: ordinary disk files, directories, and special files.
The system maintains several directories for its own use. One of these is the root directory. All files in the system can be found by tracing a path through a chain of directories until the desired file is reached. The starting point for such searches is often the root. Another system directory contains all the programs provided for general use; that is, all the commands. As will be seen however, it is by no means necessary that a program reside in this directory for it to be executed.
Files are named by sequences of 14 or fewer characters. When the name of a file is specified to the system, it may be in the form of a path name, which is a sequence of directory names separated by slashes “/” and ending in a file name. If the sequence begins with a slash, the search begins in the root directory. The name /alpha/beta/gamma causes the system to search the root for directory alpha, then to search alpha for beta, finally to find gamma in beta. Gamma may be an ordinary file, a directory, or a special file. As a limiting case, the name “/” refers to the root itself.
A path name not starting with “/” causes the system to begin the search in the user’s current directory. Thus, the name alpha/beta specifies the file named beta in subdirectory alpha of the current directory. The simplest kind of name, for example alpha, refers to a file which itself is found in the current directory. As another limiting case, the null file name refers to the current directory.
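The two search rules, from the root for names beginning with “/” and from the current directory otherwise, can be mimicked with nested dictionaries. This is a toy model only: the dictionary representation and the name `resolve` are assumptions, and the real system follows directory entries on disk.

```python
def resolve(path, root, current):
    """Return the node that a path name refers to in a toy directory tree.

    Directories are dicts mapping names to nodes; anything else is a file.
    """
    # A leading slash starts the search at the root, otherwise at the current directory.
    node = root if path.startswith("/") else current
    for name in path.split("/"):
        if name == "":
            continue  # leading slash or null name: stay where we are
        node = node[name]  # raises KeyError if the entry does not exist
    return node


# Hypothetical tree containing /alpha/beta/gamma.
root = {"alpha": {"beta": {"gamma": "contents of gamma"}}}
alpha = root["alpha"]
```

With `alpha` as the current directory, `resolve("beta/gamma", root, alpha)` and `resolve("/alpha/beta/gamma", root, alpha)` reach the same file, `resolve("/", root, alpha)` yields the root, and the null name yields `alpha` itself.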
The same nondirectory file may appear in several directories under possibly different names. This feature is called linking; a directory entry for a file is sometimes called a link. UNIX differs from other systems in which linking is permitted in that all links to a file have equal status. That is, a file does not exist within a particular directory; the directory entry for a file consists merely of its name and a pointer to the information actually describing the file. Thus a file exists independently of any directory entry, although in practice a file is made to disappear along with the last link to it.
Each directory always has at least two entries. The name “.” in each directory refers to the directory itself. Thus a program may read the current directory under the name “.” without knowing its complete path name. The name “..” by convention refers to the parent of the directory in which it appears, that is, to the directory in which it was created.
The directory structure is constrained to have the form of a rooted tree. Except for the special entries “.” and “..”, each directory must appear as an entry in exactly one other, which is its parent. The reason for this is to simplify the writing of programs which visit subtrees of the directory structure, and more important, to avoid the separation of portions of the hierarchy. If arbitrary links to directories were permitted, it would be quite difficult to detect when the last connection from the root to a directory was severed.
There is a threefold advantage in treating I/O devices this way: file and device I/O are as similar as possible; file and device names have the same syntax and meaning, so that a program expecting a file name as a parameter can be passed a device name; finally, special files are subject to the same protection mechanism as regular files.
If the seventh bit is on, the system will temporarily change the user identification of the current user to that of the creator of the file whenever the file is executed as a program. This change in user ID is effective only during the execution of the program which calls for it. The set-user-ID feature provides for privileged programs which may use files inaccessible to other users. For example, a program may keep an accounting file which should neither be read nor changed except by the program itself. If the set-user-identification bit is on for the program, it may access the file although this access might be forbidden to other programs invoked by the given program’s user. Since the actual user ID of the invoker of any program is always available, set-user-ID programs may take any measures desired to satisfy themselves as to their invoker’s credentials. This mechanism is used to allow users to execute the carefully written commands which call privileged system entries. For example, there is a system entry invocable only by the “super-user” (below) which creates an empty directory. As indicated above, directories are expected to have entries for “.” and “..”. The command which creates a directory is owned by the super user and has the set-user-ID bit set. After it checks its invoker’s authorization to create the specified directory, it creates it and makes the entries for “.” and “..”.
Since anyone may set the set-user-ID bit on one of his own files, this mechanism is generally available without administrative intervention. For example, this protection scheme easily solves the MOO accounting problem posed in Aleph-Null (1971).
The system recognizes one particular user ID (that of the “super-user”) as exempt from the usual constraints on file access; thus (for example) programs may be written to dump and reload the file system without unwanted interference from the protection system.
filep = open(name, flag)
Name indicates the name of the file. An arbitrary path name may be given. The flag argument indicates whether the file is to be read, written, or “updated”, that is, read and written simultaneously.
The returned value filep is called a file descriptor. It is a small integer used to identify the file in subsequent calls to read, write, or otherwise manipulate it.
To create a new file or completely rewrite an old one, there is a create system call which creates the given file if it does not exist, or truncates it to zero length if it does exist. Create also opens the new file for writing and, like open, returns a file descriptor.
There are no user-visible locks in the file system, nor is there any restriction on the number of users who may have a file open for reading or writing; although it is possible for the contents of a file to become scrambled when two users write on it simultaneously, in practice, difficulties do not arise. We take the view that locks are neither necessary nor sufficient, in our environment, to prevent interference between users of the same file. They are unnecessary because we are not faced with large, single-file data bases maintained by independent processes. They are insufficient because locks in the ordinary sense, whereby one user is prevented from writing on a file which another user is reading, cannot prevent confusion when, for example, both users are editing a file with an editor which makes a copy of the file being edited. It should be said that the system has sufficient internal interlocks to maintain the logical consistency of the file system when two users engage simultaneously in such inconvenient activities as writing on the same file, creating files in the same directory or deleting each other’s open files.
Except as indicated below, reading and writing are sequential. This means that if a particular byte in the file was the last byte written (or read), the next I/O call implicitly refers to the first following byte. For each open file there is a pointer, maintained by the system, which indicates the next byte to be read or written. If n bytes are read or written, the pointer advances by n bytes.
Once a file is open, the following calls may be used:
n = read(filep, buffer, count)
n = write(filep, buffer, count)
Up to count bytes are transmitted between the file specified by filep and the byte array specified by buffer. The returned value n is the number of bytes actually transmitted. In the write case, n is the same as count except under exceptional conditions like I/O errors or end of physical medium on special files; in a read, however, n may without error be less than count. If the read pointer is so near the end of the file that reading count characters would cause reading beyond the end, only sufficient bytes are transmitted to reach the end of the file; also, typewriter-like devices never return more than one line of input. When a read call returns with n equal to zero, it indicates the end of the file. For disk files this occurs when the read pointer becomes equal to the current size of the file. It is possible to generate an end-of-file from a typewriter by use of an escape sequence which depends on the device used.
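The same behavior survives in the descendants of these calls. The sketch below uses Python's `os` module, a modern wrapper and not the 1974 interface, on a temporary file; the helper name `read_all_in_chunks` is hypothetical. It shows the final read returning fewer bytes than requested and the following read returning zero bytes at the end of the file.

```python
import os
import tempfile


def read_all_in_chunks(path, count):
    """Read a file count bytes at a time; return the size of each read."""
    fd = os.open(path, os.O_RDONLY)
    sizes = []
    try:
        while True:
            chunk = os.read(fd, count)   # may return fewer than count bytes
            sizes.append(len(chunk))
            if len(chunk) == 0:          # zero bytes signals end of file
                break
    finally:
        os.close(fd)
    return sizes


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "demo")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    n = os.write(fd, b"0123456789")      # n is the number of bytes written
    os.close(fd)
    sizes = read_all_in_chunks(path, 4)
```

For a ten-byte disk file read four bytes at a time, the recorded sizes are 4, 4, 2, and finally 0.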
Bytes written on a file affect only those implied by the position of the write pointer and the count; no other part of the file is changed. If the last byte lies beyond the end of the file, the file is grown as needed.
To do random (direct access) I/O, it is only necessary to move the read or write pointer to the appropriate location in the file.
location = seek(filep, base, offset)
The pointer associated with filep is moved to a position offset bytes from the beginning of the file, from the current position of the pointer, or from the end of the file, depending on base. Offset may be negative. For some devices (e.g. paper tape and typewriters) seek calls are ignored. The actual offset from the beginning of the file to which the pointer was moved is returned in location.
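The three choices of base correspond to `os.SEEK_SET`, `os.SEEK_CUR`, and `os.SEEK_END` in Python's `os.lseek`, which is used here as a modern stand-in for the seek call described above. Note that `os.lseek` takes the offset before the base, unlike the call in the text. The file contents are a hypothetical example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "demo")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    os.write(fd, b"abcdefghij")

    # Offset measured from the beginning of the file.
    from_start = os.lseek(fd, 2, os.SEEK_SET)
    first = os.read(fd, 3)               # reads b"cde"; pointer is now at 5

    # Negative offset measured from the current position.
    from_current = os.lseek(fd, -1, os.SEEK_CUR)

    # Negative offset measured from the end of the file.
    from_end = os.lseek(fd, -2, os.SEEK_END)
    tail = os.read(fd, 10)               # only two bytes remain

    os.close(fd)
```

Each call returns the resulting offset from the beginning of the file, as the location value does in the text: 2, 4, and 8 in this example.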
As mentioned in §37.3.2 above, a directory entry contains only a name for the associated file and a pointer to the file itself. This pointer is an integer called the i-number (for index number) of the file. When the file is accessed, its i-number is used as an index into a system table (the i-list) stored in a known part of the device on which the directory resides. The entry thereby found (the file’s i-node) contains the description of the file as follows.
1. Its owner.
2. Its protection bits.
3. The physical disk or tape addresses for the file contents.
4. Its size.
5. Time of last modification.
6. The number of links to the file, that is, the number of times it appears in a directory.
7. A bit indicating whether the file is a directory.
8. A bit indicating whether the file is a special file.
9. A bit indicating whether the file is “large” or “small.”
The purpose of an open or create system call is to turn the path name given by the user into an i-number by searching the explicitly or implicitly named directories. Once a file is open, its device, i-number, and read/write pointer are stored in a system table indexed by the file descriptor returned by the open or create. Thus the file descriptor supplied during a subsequent call to read or write the file may be easily related to the information necessary to access the file.
When a new file is created, an i-node is allocated for it and a directory entry is made which contains the name of the file and the i-node number. Making a link to an existing file involves creating a directory entry with the new name, copying the i-number from the original file entry, and incrementing the link-count field of the i-node. Removing (deleting) a file is done by decrementing the link-count of the i-node specified by its directory entry and erasing the directory entry. If the link-count drops to 0, any disk blocks in the file are freed and the i-node is deallocated.
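The bookkeeping in this paragraph can be imitated with a small in-memory model. Everything named here (`ToyFileSystem`, plain dictionaries as directories, i-numbers starting at 2) is an assumption made for illustration; it is not the layout of the real i-list.

```python
class ToyFileSystem:
    """In-memory model of i-nodes with link counts; directories are plain dicts."""

    def __init__(self):
        self.inodes = {}        # i-number -> {"links": int, "blocks": list}
        self.next_inumber = 2   # assumption: 1 is kept for the root directory

    def create(self, directory, name):
        """Allocate an i-node and make a directory entry (name, i-number)."""
        inumber = self.next_inumber
        self.next_inumber += 1
        self.inodes[inumber] = {"links": 1, "blocks": []}
        directory[name] = inumber
        return inumber

    def link(self, old_dir, old_name, new_dir, new_name):
        """Copy the i-number into a new entry and increment the link count."""
        inumber = old_dir[old_name]
        new_dir[new_name] = inumber
        self.inodes[inumber]["links"] += 1

    def unlink(self, directory, name):
        """Erase the entry, decrement the link count, free the i-node at zero."""
        inumber = directory.pop(name)
        inode = self.inodes[inumber]
        inode["links"] -= 1
        if inode["links"] == 0:
            del self.inodes[inumber]   # blocks and i-node are released together
```

In this model a file created in one directory and linked from another survives the removal of either entry and disappears only with the last one.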
The space on all fixed or removable disks which contain a file system is divided into a number of 512-byte blocks logically addressed from 0 up to a limit which depends on the device. There is space in the i-node of each file for eight device addresses. A small (nonspecial) file fits into eight or fewer blocks; in this case the addresses of the blocks themselves are stored. For large (nonspecial) files, each of the eight device addresses may point to an indirect block of 256 addresses of blocks constituting the file itself. These files may be as large as 8 · 256 · 512, or 1,048,576 (2^20) bytes.
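The arithmetic behind these limits, and the way a byte offset selects an address slot, can be worked through in a short sketch. The constant names and the function `locate` are hypothetical; only the numbers 512, 8, and 256 come from the text.

```python
BLOCK_SIZE = 512            # bytes per block
ADDRS_PER_INODE = 8         # device addresses held in the i-node
ADDRS_PER_INDIRECT = 256    # block addresses held in one indirect block

MAX_SMALL = ADDRS_PER_INODE * BLOCK_SIZE                        # 4,096 bytes
MAX_LARGE = ADDRS_PER_INODE * ADDRS_PER_INDIRECT * BLOCK_SIZE   # 1,048,576 bytes


def locate(offset, large):
    """Map a byte offset to (i-node slot, indirect entry or None, byte within block)."""
    block, within = divmod(offset, BLOCK_SIZE)
    if not large:
        if block >= ADDRS_PER_INODE:
            raise ValueError("offset beyond the limit for a small file")
        return block, None, within
    slot, entry = divmod(block, ADDRS_PER_INDIRECT)
    if slot >= ADDRS_PER_INODE:
        raise ValueError("offset beyond the limit for a large file")
    return slot, entry, within
```

Byte 200,000 of a large file, for example, lies in logical block 390, which is entry 134 of the indirect block named by slot 1, at byte 320 within that block.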
The foregoing discussion applies to ordinary files. When an I/O request is made to a file whose i-node indicates that it is special, the last seven device address words are immaterial, and the list is interpreted as a pair of bytes which constitute an internal device name. These bytes specify respectively a device type and subdevice number. The device type indicates which system routine will deal with I/O on that device; the subdevice number selects, for example, a disk drive attached to a particular controller or one of several similar typewriter interfaces.
In this environment, the implementation of the mount system call (§37.3.4) is quite straightforward. Mount maintains a system table whose argument is the i-number and device name of the ordinary file specified during the mount, and whose corresponding value is the device name of the indicated special file. This table is searched for each (i-number, device)-pair which turns up while a path name is being scanned during an open or create; if a match is found, the i-number is replaced by 1 (which is the i-number of the root directory on all file systems), and the device name is replaced by the table value.
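The table lookup described above amounts to a dictionary keyed by (i-number, device) pairs. The class and method names below, and the device names used in the example, are hypothetical.

```python
ROOT_INUMBER = 1  # i-number of the root directory on every file system


class MountTable:
    """Toy version of the table consulted while a path name is scanned."""

    def __init__(self):
        self.entries = {}  # (i-number, device) of mount point -> mounted device

    def mount(self, inumber, device, mounted_device):
        """Record that the file (inumber, device) is covered by mounted_device."""
        self.entries[(inumber, device)] = mounted_device

    def translate(self, inumber, device):
        """Apply the substitution if the pair is a mount point, else pass it through."""
        mounted_device = self.entries.get((inumber, device))
        if mounted_device is None:
            return inumber, device
        return ROOT_INUMBER, mounted_device
```

If i-node 57 on a device called "disk0" is a mount point for "disk1", a path scan that reaches (57, "disk0") continues from (1, "disk1"), while every other pair is left unchanged.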
To the user, both reading and writing of files appear to be synchronous and unbuffered. That is, immediately after return from a read call the data are available, and conversely after a write the user’s workspace may be reused. In fact the system maintains a rather complicated buffering mechanism which reduces greatly the number of I/O operations required to access a file. Suppose a write call is made specifying transmission of a single byte.
UNIX will search its buffers to see whether the affected disk block currently resides in core memory; if not, it will be read in from the device. Then the affected byte is replaced in the buffer, and an entry is made in a list of blocks to be written. The return from the write call may then take place, although the actual I/O may not be completed until a later time. Conversely, if a single byte is read, the system determines whether the secondary storage block in which the byte is located is already in one of the system’s buffers; if so, the byte can be returned immediately. If not, the block is read into a buffer and the byte picked out.
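A much simplified write-behind cache shows the sequence of steps. This is a sketch under stated assumptions: the "device" is a dictionary of blocks, the class `BufferCache` and its methods are hypothetical, and buffers are never evicted, which a real system could not afford.

```python
class BufferCache:
    """Toy block cache: single-byte reads and writes go through in-core buffers."""

    def __init__(self, disk, block_size=512):
        self.disk = disk              # dict: block number -> bytes (the "device")
        self.block_size = block_size
        self.buffers = {}             # block number -> bytearray held in core
        self.dirty = set()            # blocks waiting to be written
        self.device_reads = 0

    def _buffer(self, block):
        # Read the block from the device only if it is not already in a buffer.
        if block not in self.buffers:
            self.device_reads += 1
            contents = self.disk.get(block, bytes(self.block_size))
            self.buffers[block] = bytearray(contents)
        return self.buffers[block]

    def write_byte(self, offset, value):
        block, within = divmod(offset, self.block_size)
        self._buffer(block)[within] = value
        self.dirty.add(block)         # actual I/O is deferred

    def read_byte(self, offset):
        block, within = divmod(offset, self.block_size)
        return self._buffer(block)[within]

    def flush(self):
        # Perform the deferred writes.
        for block in sorted(self.dirty):
            self.disk[block] = bytes(self.buffers[block])
        self.dirty.clear()
```

A write returns as soon as the buffer is updated; the device sees the new byte only after `flush`, and a read of the same block in the meantime is served from the buffer without touching the device.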
A program which reads or writes files in units of 512 bytes has an advantage over a program which reads or writes a single byte at a time, but the gain is not immense; it comes mainly from the avoidance of system overhead. A program which is used rarely or which does no great volume of I/O may quite reasonably read and write in units as small as it wishes.
The notion of the i-list is an unusual feature of UNIX. In practice, this method of organizing the file system has proved quite reliable and easy to deal with. To the system itself, one of its strengths is the fact that each file has a short, unambiguous name which is related in a simple way to the protection, addressing, and other information needed to access the file. It also permits a quite simple and rapid algorithm for checking the consistency of a file system, for example verification that the portions of each device containing useful information and those free to be allocated are disjoint and together exhaust the space on the device. This algorithm is independent of the directory hierarchy, since it need only scan the linearly-organized i-list. At the same time the notion of the i-list induces certain peculiarities not found in other file system organizations. For example, there is the question of who is to be charged for the space a file occupies, since all directory entries for a file have equal status. Charging the owner of a file is unfair, in general, since one user may create a file, another may link to it, and the first user may delete the file. The first user is still the owner of the file, but it should be charged to the second user. The simplest reasonably fair algorithm seems to be to spread the charges equally among users who have links to a file. The current version of UNIX avoids the issue by not charging any fees at all.
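The consistency check mentioned above (used and free portions disjoint, and together exhausting the device) might look like this in outline. It is a sketch only: representing each i-list entry as a plain list of block numbers, and the function name `check_consistency`, are assumptions made for the example.

```python
def check_consistency(total_blocks, ilist, free_blocks):
    """Scan a linear i-list and a free list; return a list of problems found.

    ilist       -- one list of block numbers per i-list entry (assumed layout)
    free_blocks -- block numbers recorded as free to be allocated
    """
    used = set()
    for blocks in ilist:              # linear scan; no directory hierarchy needed
        used.update(blocks)
    free = set(free_blocks)

    problems = []
    for block in sorted(used & free):
        problems.append(f"block {block} is both in use and free")
    for block in sorted(set(range(total_blocks)) - used - free):
        problems.append(f"block {block} is neither in use nor free")
    return problems
```

An empty result means the two sets are disjoint and together cover every block on the device.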
An image is a computer execution environment. It includes a core image, general register values, status of open files, current directory, and the like. An image is the current state of a pseudo computer.
A process is the execution of an image. While the processor is executing on behalf of a process, the image must reside in core; during the execution of other processes it remains in core unless the appearance of an active, higher-priority process forces it to be swapped out to the fixed-head disk.
The user-core part of an image is divided into three logical segments. The program text segment begins at location 0 in the virtual address space. During execution, this segment is write-protected and a single copy of it is shared among all processes executing the same program. At the first 8K byte boundary above the program text segment in the virtual address space begins a non-shared, writable data segment, the size of which may be extended by a system call. Starting at the highest address in the virtual address space is a stack segment, which automatically grows downward as the hardware’s stack pointer fluctuates.
processid = fork (label)
When fork is executed by a process, it splits into two independently executing processes. The two processes have independent copies of the original core image, and share any open files. The new processes differ only in that one is considered the parent process: in the parent, control returns directly from the fork, while in the child, control is passed to location label. The processid returned by the fork call is the identification of the other process. Because the return points in the parent and child process are not the same, each image existing after a fork may determine whether it is the parent or child process.
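The independence of the two core images can be observed with the `os.fork` call in Python on a present-day Unix-like system. Note the assumption: the modern call takes no label argument; it tells parent from child by its return value (0 in the child, the child's process id in the parent) rather than by different return points. The function name is hypothetical.

```python
import os


def fork_copies_image():
    """Fork once and show that the child's change does not reach the parent."""
    data = ["original"]
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the image.
        data[0] = "changed in child"
        os._exit(0)
    # Parent: pid identifies the other process.
    os.waitpid(pid, 0)
    return pid, data[0]
```

In the parent the second element of the result is still `"original"`, since each process has its own copy of the data.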
filep = pipe( )
returns a file descriptor filep and creates an interprocess channel called a pipe. This channel, like other open files, is passed from parent to child process in the image by the fork call. A read using a pipe file descriptor waits until another process writes using the file descriptor for the same pipe. At this point, data are passed between the images of the two processes. Neither process need know that a pipe, rather than an ordinary file, is involved.
Although interprocess communication via pipes is a quite valuable tool (see §37.6.2), it is not a completely general mechanism since the pipe must be set up by a common ancestor of the processes involved.
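A pipe set up by a common ancestor and inherited across a fork can be sketched as follows. This uses Python's `os.pipe`, which on current systems returns a pair of descriptors (one for reading, one for writing) rather than the single `filep` of the paper; that difference, and the function name, are assumptions of the example.

```python
import os


def send_through_pipe(message):
    """Create a pipe, fork, let the child write and the parent read."""
    read_fd, write_fd = os.pipe()     # created before the fork, so both inherit it
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        os.write(write_fd, message)   # the child writes as if to an ordinary file
        os._exit(0)
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 512)  # waits until data or end of file arrives
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

The reading side uses only ordinary read calls, which matches the remark that neither process need know a pipe is involved.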
execute(file, arg1, arg2, …, argn)
which requests the system to read in and execute the program named by file, passing it string arguments arg1, arg2, …, argn. Ordinarily, arg1 should be the same string as file, so that the program may determine the name by which it was invoked. All the code and data in the process using execute is replaced from the file, but open files, current directory, and interprocess relationships are unaltered. Only if the call fails, for example because file could not be found or because its execute-permission bit was not set, does a return take place from the execute primitive; it resembles a “jump” machine instruction rather than a subroutine call.
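The "jump rather than subroutine call" behavior can be seen with `os.execv`, the present-day counterpart of execute in Python. In this sketch the exit status 127 for a failed call is purely a convention of the example, and `try_execute` is a hypothetical name.

```python
import os


def try_execute(path, args):
    """Run the program at path in a child process; return its exit status."""
    pid = os.fork()
    if pid == 0:
        try:
            # By convention args[0] repeats the program name.
            os.execv(path, args)
            # Not reached on success: the image has been replaced.
        except OSError:
            # Control comes back only because the call failed.
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

For example, `try_execute("/bin/sh", ["/bin/sh", "-c", "exit 3"])` yields 3 on a system that has `/bin/sh`, while a nonexistent path yields the sketch's failure status.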
processid = wait( )
causes its caller to suspend execution until one of its children has completed execution. Then wait returns the processid of the terminated process. An error return is taken if the calling process has no descendants. Certain status from the child process is also available. Wait may also present status from a grandchild or more distant descendant; see §37.5.5.
exit (status)
terminates a process, destroys its image, closes its open files, and generally obliterates it. When the parent is notified through the wait primitive, the indicated status is available to the parent; if the parent has already terminated, the status is available to the grandparent, and so on. Processes may also terminate as a result of various illegal actions or user-generated signals (§37.7 below).
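How wait and exit cooperate can be illustrated with `os.wait` and `os._exit` in Python. This is a sketch for a current Unix-like system; the helper name `spawn_and_wait` is hypothetical, and the status values passed in are assumed to fit in the range 0 to 255.

```python
import os


def spawn_and_wait(statuses):
    """Start one child per status value, then collect them with wait."""
    started = {}
    for status in statuses:
        pid = os.fork()
        if pid == 0:
            os._exit(status)           # child terminates with the given status
        started[pid] = status

    finished = {}
    while len(finished) < len(started):
        pid, raw = os.wait()           # suspends until some child has terminated
        if pid in started:
            finished[pid] = os.WEXITSTATUS(raw)
    return started, finished
```

Each call to wait returns the process id of whichever child finished, together with the status that child handed to exit.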
For most users, communication with UNIX is carried on with the aid of a program called the Shell. The Shell is a command line interpreter: it reads lines typed by the user and interprets them as requests to execute other programs. In simplest form, a command line consists of the command name followed by arguments to the command, all separated by spaces:
command arg1 arg2 … argn
The Shell splits up the command name and the arguments into separate strings. Then a file with name command is sought; command may be a path name including the “/” character to specify any file in the system. If command is found, it is brought into core and executed. The arguments collected by the Shell are accessible to the command. When the command is finished, the Shell resumes its own execution, and indicates its readiness to accept another command by typing a prompt character.
If file command cannot be found, the Shell prefixes the string /bin/ to command and attempts again to find the file. Directory /bin contains all the commands intended to be generally used.
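The two-step search for a command can be written out as a small function. This is a sketch: `find_command` is a hypothetical name, and the `exists` parameter is an assumption added so the lookup can be tried against a made-up set of files instead of the real file system.

```python
import os


def find_command(command, exists=os.path.exists):
    """Return the path of the file to execute, or None if the search fails."""
    if exists(command):                # first try the name exactly as typed
        return command
    fallback = "/bin/" + command       # then prefix /bin/ and try again
    if exists(fallback):
        return fallback
    return None
```

A name containing "/" is simply tried as given first, so a path name can select any file in the system.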
ls
ordinarily lists, on the typewriter, the names of the files in the current directory. The command
ls >there
creates a file called there and places the listing there. Thus the argument “>there” means, “place output on there.” On the other hand,
ed
ordinarily enters the editor, which takes requests from the user via his typewriter. The command
ed <script
interprets script as a file of editor commands; thus “<script” means, “take input from script.”
Although the file name following “<” or “>” appears to be an argument to the command, in fact it is interpreted completely by the Shell and is not passed to the command at all. Thus no special coding to handle I/O redirection is needed within each command; the command need merely use the standard file descriptors 0 and 1 where appropriate.
ls | pr -2 | opr
ls lists the names of the files in the current directory; its output is passed to pr, which paginates its input with dated headings. The argument “-2” means double column. Likewise the output from pr is input to opr. This command spools its input onto a file for off-line printing.
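One way a command interpreter could wire such a chain together is sketched here with `os.pipe`, `os.fork`, `os.dup2`, and `os.execvp` from Python. It is an illustration, not the Shell's actual code: the name `run_pipeline`, the `final_stdout` parameter, and the exit status 127 for a command that cannot be executed are all assumptions of the example.

```python
import os


def run_pipeline(commands, final_stdout=1):
    """Run each argv list in commands, feeding each one's output to the next."""
    pids = []
    upstream = None                       # read end of the pipe from the previous command
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        if not is_last:
            read_end, write_end = os.pipe()
        pid = os.fork()
        if pid == 0:
            if upstream is not None:
                os.dup2(upstream, 0)      # standard input comes from the previous command
            target = final_stdout if is_last else write_end
            if target != 1:
                os.dup2(target, 1)        # standard output goes to the next command
            try:
                os.execvp(argv[0], argv)
            except OSError:
                os._exit(127)
        pids.append(pid)
        # The parent keeps only the read end needed by the next command.
        if upstream is not None:
            os.close(upstream)
        upstream = None if is_last else read_end
        if not is_last:
            os.close(write_end)
    return [os.WEXITSTATUS(os.waitpid(pid, 0)[1]) for pid in pids]
```

On a system where those commands exist, `run_pipeline([["ls"], ["pr", "-2"]])` would correspond to the first two stages of the example; none of the commands needs any code of its own to take part in the chain.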
This process could have been carried out more clumsily by
ls > temp1
pr -2 <temp1 >temp2
opr <temp2
followed by removal of the temporary files. In the absence of the ability to redirect output and input, a still clumsier method would have been to require the ls command to accept user requests to paginate its output, to print in multicolumn format, and to arrange that its output be delivered off-line. Actually it would be surprising, and in fact unwise for efficiency reasons, to expect authors of commands such as ls to provide such a wide variety of output options.
A program such as pr which copies its standard input to its standard output (with processing) is called a filter. Some filters which we have found useful perform character transliteration, sorting of the input, and encryption and decryption.
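A character-transliteration filter of the kind mentioned might be sketched as follows. The function names and the choice of Python's `str.translate` are assumptions of the example; the point is only that the program reads its standard input, transforms it, and writes its standard output.

```python
import sys


def transliterate_lines(lines, from_chars, to_chars):
    """Yield each input line with characters in from_chars replaced one for one."""
    table = str.maketrans(from_chars, to_chars)   # the two strings must be equally long
    for line in lines:
        yield line.translate(table)


def main(from_chars, to_chars):
    # As a filter: copy standard input to standard output, with processing.
    sys.stdout.writelines(transliterate_lines(sys.stdin, from_chars, to_chars))
```

Because `main` touches only standard input and output, such a program can be placed anywhere in a pipeline or used with redirected files.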
ls; ed
will first list the contents of the current directory, then enter the editor.
A related feature is more interesting. If a command is followed by “&”, the Shell will not wait for the command to finish before prompting again; instead, it is ready immediately to accept a new command. For example,
as source > output &
causes source to be assembled, with diagnostic output going to output; no matter how long the assembly takes, the Shell returns immediately. When the Shell does not wait for the completion of a command, the identification of the process running that command is printed. This identification may be used to wait for the completion of the command or to terminate it. The “&” may be used several times in a line:
as source > output & ls > files &
does both the assembly and the listing in the background. In the examples above using “&”, an output file other than the typewriter was provided; if this had not been done, the outputs of the various commands would have been intermingled.
The Shell also allows parentheses in the above operations. For example,
(date; ls) > x &
prints the current date and time followed by a list of the current directory onto the file x. The Shell also returns immediately for another request.
as source
mv a.out testprog
testprog
The mv command causes the file a.out to be renamed testprog. a.out is the (binary) output of the assembler, ready to be executed. Thus if the three lines above were typed on the console, source would be assembled, the resulting program named testprog, and testprog executed. When the lines are in tryout, the command
sh < tryout
would cause the Shell sh to execute the commands sequentially.
The Shell has further capabilities, including the ability to substitute parameters and to construct argument lists from a specified subset of the file names in a directory. It is also possible to execute commands conditionally on character string comparisons or on existence of given files and to perform transfers of control within filed command sequences.
Given this framework, the implementation of background processes is trivial; whenever a command line contains “&”, the Shell merely refrains from waiting for the process which it created to execute the command.
Happily, all of this mechanism meshes very nicely with the notion of standard input and output files. When a process is created by the fork primitive, it inherits not only the core image of its parent but also all the files currently open in its parent, including those with file descriptors 0 and 1. The Shell, of course, uses these files to read command lines and to write its prompts and diagnostics, and in the ordinary case its children, the command programs, inherit them automatically. When an argument with “<” or “>” is given, however, the offspring process, just before it performs execute, makes the standard I/O file descriptor 0 or 1 respectively refer to the named file. This is easy because, by agreement, the smallest unused file descriptor is assigned when a new file is opened (or created); it is only necessary to close file 0 (or 1) and open the named file. Because the process in which the command program runs simply terminates when it is through, the association between a file specified after “<” or “>” and file descriptor 0 or 1 is ended automatically when the process dies. Therefore the Shell need not know the actual names of the files which are its own standard input and output since it need never reopen them.
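The close-then-open trick for output redirection can be tried out with Python's `os` module. The sketch makes several assumptions: the function name is hypothetical, exit status 127 is the example's own marker for any failure in the child, and the call to `os.set_inheritable` is needed only because Python opens files so that they are closed across an exec unless told otherwise.

```python
import os


def run_with_output_to(argv, path):
    """Run argv in a child whose file descriptor 1 refers to the file at path."""
    pid = os.fork()
    if pid == 0:
        try:
            os.close(1)                                   # free descriptor 1
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            if fd == 1:                                   # smallest unused descriptor
                os.set_inheritable(fd, True)              # keep it open across the exec
                os.execvp(argv[0], argv)
        except OSError:
            pass
        os._exit(127)                                     # reached only on failure
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

The parent's own descriptor 1 is never touched, so nothing has to be reopened or restored once the child has terminated.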
Filters are straightforward extensions of standard I/O redirection with pipes used instead of files.
In ordinary circumstances, the main loop of the Shell never terminates. (The main loop includes that branch of the return from fork belonging to the parent process; that is, the branch which does a wait, then reads another command line.) The one thing which causes the Shell to terminate is discovering an end-of-file condition on its input file. Thus, when the Shell is executed as a command with a given input file, as in
sh < comfile
the commands in comfile will be executed until the end of comfile is reached; then the instance of the Shell invoked by sh will terminate. Since this Shell process is the child of another instance of the Shell, the wait executed in the latter will return, and another command may be processed.
Meanwhile, the mainstream path of init (the parent of all the subinstances of itself which will later become Shells) does a wait. If one of the child processes terminates, either because a Shell found an end of file or because a user typed an incorrect name or password, this path of init simply recreates the defunct process, which in turn reopens the appropriate input and output files and types another login message. Thus a user may log out simply by typing the end-of-file sequence in place of a command to the Shell.
Recall that after a user has successfully logged in by supplying his name and password, init ordinarily invokes the Shell to interpret command lines. The user’s entry in the password file may contain the name of a program to be invoked after login instead of the Shell. This program is free to interpret the user’s messages in any way it wishes.
For example, the password file entries for users of a secretarial editing system specify that the editor ed is to be used instead of the Shell. Thus when editing system users log in, they are inside the editor and can begin work immediately; also, they can be prevented from invoking UNIX programs not intended for their use. In practice, it has proved desirable to allow a temporary escape from the editor to execute the formatting program and other utilities.
Several of the games (e.g. chess, blackjack, 3D tic-tac-toe) available on UNIX illustrate a much more severely restricted environment. For each of these an entry exists in the password file specifying that the appropriate game-playing program is to be invoked instead of the Shell. People who log in as a player of one of the games find themselves limited to the game and unable to investigate the presumably more interesting offerings of UNIX as a whole.
The PDP-11 hardware detects a number of program faults, such as references to nonexistent memory, unimplemented instructions, and odd addresses used where an even address is required. Such faults cause the processor to trap to a system routine. When an illegal action is caught, unless other arrangements have been made, the system terminates the process and writes the user’s image on file core in the current directory. A debugger can be used to determine the state of the program at the time of the fault.
Programs which are looping, which produce unwanted output, or about which the user has second thoughts may be halted by the use of the interrupt signal, which is generated by typing the “delete” character. Unless special action has been taken, this signal simply causes the program to cease execution without producing a core image file.
There is also a quit signal which is used to force a core image to be produced. Thus programs which loop unexpectedly may be halted and the core image examined without prearrangement.
The hardware-generated faults and the interrupt and quit signals can, by request, be either ignored or caught by the process. For example, the Shell ignores quits to prevent a quit from logging the user out. The editor catches interrupts and returns to its command level. This is useful for stopping long printouts without losing work in progress (the editor manipulates a copy of the file it is editing). In systems without floating point hardware, unimplemented instructions are caught, and floating point instructions are interpreted.
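Catching one signal while ignoring another can be sketched with Python's `signal` module, in which `SIGINT` and `SIGQUIT` are the present-day names of the interrupt and quit signals. The function name is hypothetical, and the sketch assumes it is called from the main thread, as Python requires for installing handlers.

```python
import os
import signal


def run_catching_interrupt(work):
    """Run work() with interrupt caught and quit ignored; return caught signals."""
    caught = []
    old_int = signal.signal(signal.SIGINT,
                            lambda signum, frame: caught.append(signum))
    old_quit = signal.signal(signal.SIGQUIT, signal.SIG_IGN)
    try:
        work()                      # interrupts are recorded instead of ending the program
    finally:
        # Restore whatever arrangements were in force before.
        signal.signal(signal.SIGINT, old_int)
        signal.signal(signal.SIGQUIT, old_quit)
    return caught
```

A program structured this way can return to its own command level after an interrupt, much as the editor is described as doing.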
Perhaps paradoxically, the success of UNIX is largely due to the fact that it was not designed to meet any predefined objectives. The first version was written when one of us (Thompson), dissatisfied with the available computer facilities, discovered a little-used PDP-7 and set out to create a more hospitable environment. This essentially personal effort was sufficiently successful to gain the interest of the remaining author and others, and later to justify the acquisition of the PDP-11/20, specifically to support a text editing and formatting system. When in turn the 11/20 was outgrown, UNIX had proved useful enough to persuade management to invest in the PDP-11/45. Our goals throughout the effort, when articulated at all, have always concerned themselves with building a comfortable relationship with the machine and with exploring ideas and inventions in operating systems. We have not been faced with the need to satisfy someone else’s requirements, and for this freedom we are grateful. Three considerations which influenced the design of UNIX are visible in retrospect.
First, since we are programmers, we naturally designed the system to make it easy to write, test, and run programs. The most important expression of our desire for programming convenience was that the system was arranged for interactive use, even though the original version only supported one user. We believe that a properly designed interactive system is much more productive and satisfying to use than a “batch” system. Moreover such a system is rather easily adaptable to noninteractive use, while the converse is not true. Second, there have always been fairly severe size constraints on the system and its software. Given the partially antagonistic desires for reasonable efficiency and expressive power, the size constraint has encouraged not only economy but a certain elegance of design. This may be a thinly disguised version of the “salvation through suffering” philosophy, but in our case it worked.
Third, nearly from the start, the system was able to, and did, maintain itself. This fact is more important than it might seem. If designers of a system are forced to use that system, they quickly become aware of its functional and superficial deficiencies and are strongly motivated to correct them before it is too late. Since all source programs were always available and easily modified on-line, we were willing to revise and rewrite the system and its software when new ideas were invented, discovered, or suggested by others.
The aspects of UNIX discussed in this paper exhibit clearly at least the first two of these design considerations. The interface to the file system, for example, is extremely convenient from a programming standpoint. The lowest possible interface level is designed to eliminate distinctions between the various devices and files and between direct and sequential access. No large “access method” routines are required to insulate the programmer from the system calls; in fact, all user programs either call the system directly or use a small library program, only tens of instructions long, which buffers a number of characters and reads or writes them all at once.
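A small buffering routine of the kind mentioned, one that collects characters and writes them all at once, could be as short as this. It is a sketch: the class `BufferedWriter`, its method names, and the default size of 512 bytes are assumptions of the example, not a description of the library the authors used.

```python
import os


class BufferedWriter:
    """Collect single bytes and hand them to the system in one write call."""

    def __init__(self, fd, size=512):
        self.fd = fd
        self.size = size
        self.pending = bytearray()

    def putc(self, byte):
        self.pending.append(byte)
        if len(self.pending) >= self.size:   # buffer full: write it all at once
            self.flush()

    def flush(self):
        if self.pending:
            os.write(self.fd, bytes(self.pending))
            self.pending.clear()
```

The same object works unchanged whether `fd` refers to a file, a device, or a pipe, since it relies on nothing beyond the ordinary write call.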
Another important aspect of programming convenience is that there are no “control blocks” with a complicated structure partially maintained by and depended on by the file system or other system calls. Generally speaking, the contents of a program’s address space are the property of the program, and we have tried to avoid placing restrictions on the data structures within that address space.
Given the requirement that all programs should be usable with any file or device as input or output, it is also desirable from a space-efficiency standpoint to push device-dependent considerations into the operating system itself. The only alternatives seem to be to load routines for dealing with each device with all programs, which is expensive in space, or to depend on some means of dynamically linking to the routine appropriate to each device when it is actually needed, which is expensive either in overhead or in hardware.
Likewise, the process control scheme and command interface have proved both convenient and efficient. Since the Shell operates as an ordinary, swappable user program, it consumes no wired-down space in the system proper, and it may be made as powerful as desired at little cost. In particular, given the framework in which the Shell executes as a process which spawns other processes to perform commands, the notions of I/O redirection, background processes, command files, and user-selectable system interfaces all become essentially trivial to implement.
…
Reprinted from Ritchie and Thompson (1974), with permission from the Association for Computing Machinery.
Computer networking responded to several needs and took shape under the influence of commercial, governmental, and scientific initiatives. The launch of the Soviet space satellite Sputnik in 1957 resulted in the formation of the Advanced Research Projects Agency (ARPA) in the U.S. Department of Defense, essentially tasked with fostering research that would avoid such shocking surprises in the future. As early as 1960, Paul Baran, a researcher at the RAND Corporation with funding from the Air Force, was proposing that a redundant, distributed, packet-switched network could survive a nuclear attack that might cripple the U.S. telecommunications grid (Baran, 1964). Meanwhile J. C. R. Licklider, after publishing Man-Computer Symbiosis (chapter 20), began speculating about an “Intergalactic Computer Network,” and when he became the head of the Information Processing Technology Office (IPTO) within ARPA in 1964, started generously funding computer networking research, an initiative that was continued by his successor at IPTO, Robert Taylor. Taylor started what would become the ARPANET, which became the internet, by using a million dollars of his budget to enable remote access to ARPA-funded computers. Taylor tasked Lawrence “Larry” Roberts with designing the network, and Roberts, after connecting with Baran and the British researcher Donald Davies (who was developing related ideas), settled on the packet-switched design and, in 1968, issued the request for proposals to build the ARPANET hardware and software. Roberts succeeded Taylor as head of IPTO, providing continuity to the networking initiatives.
Meanwhile, IBM, DEC, and other computer manufacturers were developing their own networks so as to share data and peripheral devices, but the manufacturers had no incentive to agree on a networking standard that would enable competitors’ machines to join a network of their machines. Vinton Cerf (b. 1943) and Robert Kahn (b. 1938) turned the ARPANET into the internet by developing a set of protocols for connecting networks together.
Cerf and Kahn had both worked on the early ARPANET project, Cerf at UCLA and Kahn at Bolt Beranek and Newman (BBN), the firm that had won the initial ARPANET contract. In 1973, Cerf and Kahn began to collaborate on the design of standards and protocols for internetworking, which yielded this seminal paper. The basic design it describes remains in place, with one major revision. What the paper calls TCP was in 1978 split into a set of two protocols, TCP/IP—the host protocol TCP and the separate Internet Protocol IP. TCP provides reliable end-to-end service between hosts, relying on IP, a simpler and unreliable protocol running on gateways.
Cerf and Kahn have both played major roles in promoting not just internet technology but the spirit of the internet as a global information sharing resource. Their Turing Award citation (ACM, 2004) recognizes them both for their technical contributions and for “inspired leadership in networking.”
A protocol that supports the sharing of resources that exist in different packet switching networks is presented. The protocol provides for variation in individual network packet sizes, transmission failures, sequencing, flow control, end-to-end error checking, and the creation and destruction of logical process-to-process connections. Some implementation issues are considered, and problems such as internetwork routing, accounting, and timeouts are exposed.
在过去的几年里,人们在分组交换网络的设计和实现上投入了大量的精力(Roberts and Wessler, 1970a; Pouzin, 1973b; Dell, 1971; Scantlebury and Wilkinson, 1971; Barber, 1972; Despres, 1972; Kahn and Crowther, 1972)。开发此类网络的主要原因是促进计算机资源的共享。分组通信网络包括用于在计算机之间或计算机与终端之间传送数据的传输机制。为了使数据有意义,计算机和终端共享一个通用协议(即一组商定的约定)。为此目的已经开发了几种协议(Chambon 等人,1973 年;Carr 等人,1970 年;McKenzie,1972 年;Pouzin,1973a;Walden,1972 年;McKenzie,1973 年)。然而,这些协议仅解决了同一网络上的通信问题。在本文中,我们提出了一种协议设计和理念,支持不同分组交换网络中存在的资源共享。
In the last few years considerable effort has been expended on the design and implementation of packet switching networks (Roberts and Wessler, 1970a; Pouzin, 1973b; Dell, 1971; Scantlebury and Wilkinson, 1971; Barber, 1972; Despres, 1972; Kahn and Crowther, 1972). A principal reason for developing such networks has been to facilitate the sharing of computer resources. A packet communication network includes a transportation mechanism for delivering data between computers or between computers and terminals. To make the data meaningful, computers and terminals share a common protocol (i.e., a set of agreed upon conventions). Several protocols have already been developed for this purpose (Chambon et al., 1973; Carr et al., 1970; McKenzie, 1972; Pouzin, 1973a; Walden, 1972; McKenzie, 1973). However, these protocols have addressed only the problem of communication on the same network. In this paper we present a protocol design and philosophy that supports the sharing of resources that exist in different packet switching networks.
在简要介绍了互联网络协议问题之后,我们描述了网关作为网络之间接口的功能,并讨论了它在协议中的作用。然后我们考虑协议的各种细节,包括寻址、格式化、缓冲、排序、流控制、错误控制等等。我们最后描述了进程间通信机制,并展示了互联网络协议如何支持它。
After a brief introduction to internetwork protocol issues, we describe the function of a GATEWAY as an interface between networks and discuss its role in the protocol. We then consider the various details of the protocol, including addressing, formatting, buffering, sequencing, flow control, error control, and so forth. We close with a description of an interprocess communication mechanism and show how it can be supported by the internetwork protocol.
尽管在设计单个分组交换网络时必须解决许多不同且复杂的问题,但当不同的网络互连时,这些问题显然会变得更加复杂。出现的问题可能在单个网络中没有直接对应的问题,并且强烈影响网络通信的发生方式。
Even though many different and complex problems must be solved in the design of an individual packet switching network, these problems are manifestly compounded when dissimilar networks are interconnected. Issues arise which may have no direct counterpart in an individual network and which strongly influence the way in which internetwork communication can take place.
典型的分组交换网络由一组称为主机的计算机资源、一组一个或多个分组交换机以及互连分组交换机的通信介质的集合组成。在每个HOST中,我们假设存在必须存在的进程与自己或其他主机中的进程进行通信。任何当前的流程定义都足以满足我们的目的(Lampson,1968)。这些进程通常是网络中数据的最终来源和目的地。通常,在单个网络内,存在用于任何源进程和目标进程之间通信的协议。只有源进程和目标进程需要了解此约定才能进行通信。两个不同网络中的进程通常会使用不同的协议来实现此目的。分组交换和通信介质的集合称为分组交换子网。图 38.1说明了这些想法。
A typical packet switching network is composed of a set of computer resources called HOSTS, a set of one or more packet switches, and a collection of communication media that interconnect the packet switches. Within each HOST, we assume that there exist processes which must communicate with processes in their own or other HOSTS. Any current definition of a process will be adequate for our purposes (Lampson, 1968). These processes are generally the ultimate source and destination of data in the network. Typically, within an individual network, there exists a protocol for communication between any source and destination process. Only the source and destination processes require knowledge of this convention for communication to take place. Processes in two distinct networks would ordinarily use different protocols for this purpose. The ensemble of packet switches and communication media is called the packet switching subnet. Figure 38.1 illustrates these ideas.
图 38.1: 典型的分组交换网络
Figure 38.1: Typical packet switching network
在典型的分组交换子网中,从源主机接受固定最大大小的数据,以及用于以存储和转发方式路由数据的格式化目标地址。该数据的传输时间通常取决于内部网络参数,例如通信媒体数据速率、缓冲和信令策略、路由、传播延迟等。此外,通常存在一些机制用于错误处理和确定网络状态成分。
In a typical packet switching subnet, data of a fixed maximum size are accepted from a source HOST, together with a formatted destination address which is used to route the data in a store and forward fashion. The transmit time for this data is usually dependent upon internal network parameters such as communication media data rates, buffering and signaling strategies, routing, propagation delays, etc. In addition, some mechanism is generally present for error handling and determination of status of the network’s components.
各个分组交换网络的实现可能有所不同,如下所示。
Individual packet switching networks may differ in their implementations as follows.
1. 每个网络可能有不同的方式来对接收器进行寻址,因此需要创建每个单独的网络都可以理解的统一寻址方案。
1. Each network may have distinct ways of addressing the receiver, thus requiring that a uniform addressing scheme be created which can be understood by each individual network.
2.每个网络可以接受不同最大尺寸的数据,因此要求网络以最小的最大尺寸(这可能小得不切实际)为单位进行处理,或者需要允许跨越网络边界的数据被重新格式化为更小的片段的程序。
2. Each network may accept data of different maximum size, thus requiring networks to deal in units of the smallest maximum size (which may be impractically small) or requiring procedures which allow data crossing a network boundary to be reformatted into smaller pieces.
3. 传输的成功或失败及其在每个网络中的性能取决于接受、交付和传输数据的不同时间延迟。这需要仔细开发互联网络定时程序,以确保数据可以通过各种网络成功传送。
3. The success or failure of a transmission and its performance in each network is governed by different time delays in accepting, delivering, and transporting the data. This requires careful development of internetwork timing procedures to ensure that data can be successfully delivered through the various networks.
4. 在每个网络内,通信可能由于不可恢复的数据突变或数据丢失而中断。为了从这些情况中完全恢复,需要端到端的恢复过程。
4. Within each network, communication may be disrupted due to unrecoverable mutation of the data or missing data. End-to-end restoration procedures are desirable to allow complete recovery from these conditions.
5. 每个网络中的状态信息、路由、故障检测和隔离通常是不同的。因此,为了获得某些条件的验证,例如不可访问或死的目的地,必须在通信网络之间调用各种协调。
5. Status information, routing, fault detection, and isolation are typically different in each network. Thus, to obtain verification of certain conditions, such as an inaccessible or dead destination, various kinds of coordination must be invoked between the communicating networks.
如果网络之间的所有差异都可以通过网络边界处的适当接口来经济地解决,那将非常方便。对于许多差异,这个目标是可以实现的。然而,经济和技术方面的考虑使我们更喜欢接口尽可能简单和可靠,并且主要处理使用不同数据包交换策略的网络之间传递数据。
It would be extremely convenient if all the differences between networks could be economically resolved by suitable interfacing at the network boundaries. For many of the differences, this objective can be achieved. However, both economic and technical considerations lead us to prefer that the interface be as simple and reliable as possible and deal primarily with passing data between networks that use different packet switching strategies.
现在出现的问题是,接口是否应该通过将源约定转换为相应的目标来解决主机或进程级协议的差异惯例。显然,我们希望允许接口处的数据包交换策略之间的转换,以允许现有网络和规划网络的互连。然而,主机或进程级协议的复杂性和差异性使得人们希望避免在接口处进行它们之间的转换,即使这种转换总是可能的。相反,必须开发兼容的主机和进程级协议以实现有效的网络资源共享。不可接受的替代方案是每个主机或进程都实现与其他网络通信可能需要的每个协议(可能是无限的数量)。因此,我们假设不同网络中的主机或进程之间使用通用协议,并且网络之间的接口在该协议中应扮演尽可能小的角色。
The question now arises as to whether the interface ought to account for differences in HOST or process level protocols by transforming the source conventions into the corresponding destination conventions. We obviously want to allow conversion between packet switching strategies at the interface, to permit interconnection of existing and planned networks. However, the complexity and dissimilarity of the HOST or process level protocols makes it desirable to avoid having to transform between them at the interface, even if this transformation were always possible. Rather, compatible HOST and process level protocols must be developed to achieve effective internetwork resource sharing. The unacceptable alternative is for every HOST or process to implement every protocol (a potentially unbounded number) that may be needed to communicate with other networks. We therefore assume that a common protocol is to be used between HOSTS or processes in different networks and that the interface between networks should take as small a role as possible in this protocol.
为了让不同归属的网络能够互联,毫无疑问需要对经过接口的流量进行一些计费。用最简单的术语来说,这涉及对每个网络处理的数据包进行统计,费用从一个网络传递到另一个网络,直到责任最终落在用户或其代表身上。此外,互连必须保持每个单独网络的内部操作完好无损。如果两个网络互连,就像每个网络都是另一个网络的主机一样,但不利用或实际上合并任何复杂的主机协议转换,那么这很容易实现。因此,显然网络之间的接口必须在任何网络互连策略的开发中发挥核心作用。我们为执行这些功能的接口指定一个特殊的名称,并将其称为GATEWAY。
To allow networks under different ownership to interconnect, some accounting will undoubtedly be needed for traffic that passes across the interface. In its simplest terms, this involves an accounting of packets handled by each net for which charges are passed from net to net until the buck finally stops at the user or his representative. Furthermore, the interconnection must preserve intact the internal operation of each individual network. This is easily achieved if two networks interconnect as if each were a HOST to the other network, but without utilising or indeed incorporating any elaborate HOST protocol transformations. It is thus apparent that the interface between networks must play a central role in the development of any network interconnection strategy. We give a special name to this interface that performs these functions and call it a GATEWAY.
在图 38.2中,我们展示了标记为A、B和C的三个独立网络,它们由网关 M和N连接。GATEWAY M连接网络A和网络B,GATEWAY N连接网络B和网络C。我们假设单个网络可能有多个网关(例如,网络B),并且可能有多个网关路径用于在一对网络之间进行传输。正确路由数据的责任在于网关。
In Figure 38.2 we illustrate three individual networks labelled A, B, and C which are joined by GATEWAYS M and N. GATEWAY M interfaces network A with network B, and GATEWAY N interfaces network B to network C. We assume that an individual network may have more than one GATEWAY (e.g., network B) and that there may be more than one GATEWAY path to use in going between a pair of networks. The responsibility for properly routing data resides in the GATEWAY.
图 38.2:通过两个 网关互连的三个网络
Figure 38.2: Three networks interconnected by two GATEWAYS
实际上,两个网络之间的网关可能由两半组成,每一半都与其自己的网络相关联。可以实现网关的每一半,因此它只需要以本地数据包格式嵌入互联网数据包或提取它们。我们建议网关 以标准格式处理网络数据包,但我们不建议网关两半之间的任何特定传输过程。
In practice, a GATEWAY between two networks may be composed of two halves, each associated with its own network. It is possible to implement each half of a GATEWAY so it need only embed internetwork packets in local packet format or extract them. We propose that the GATEWAY handle internetwork packets in a standard format, but we are not proposing any particular transmission procedure between GATEWAY halves.
现在让我们追踪互连网络中的数据流。我们假设来自进程X的数据包进入网络A,目的地是网络C中的进程Y。Y的地址最初由进程X指定, GATEWAY M的地址源自进程Y的地址。我们没有尝试指定GATEWAY的选择是由进程X、其HOST还是网络A中的其中一台数据包交换机做出的。数据包穿过网络A直到到达网关M。在网关处,数据包被重新格式化以满足网络B的要求,并考虑到A和B之间的流量单位,然后网关将数据包传送到网络B。同样,下一个网关地址的推导是基于目的地Y的地址完成的。在本例中,GATEWAY N是下一个。数据包穿过网络B直到最终到达网关N,在网关 N 中数据包被格式化以满足网络C的要求。再次考虑网络B和C之间的该流量单位。进入网络C后,数据包被路由到进程Y所在的主机,并在那里被传递到其最终目的地。
Let us now trace the flow of data through the interconnected networks. We assume a packet of data from process X enters network A destined for process Y in network C. The address of Y is initially specified by process X and the address of GATEWAY M is derived from the address of process Y. We make no attempt to specify whether the choice of GATEWAY is made by process X, its HOST, or one of the packet switches in network A. The packet traverses network A until it reaches GATEWAY M. At the GATEWAY, the packet is reformatted to meet the requirements of network B, account is taken of this unit of flow between A and B, and the GATEWAY delivers the packet to network B. Again the derivation of the next GATEWAY address is accomplished based on the address of the destination Y. In this case, GATEWAY N is the next one. The packet traverses network B until it finally reaches GATEWAY N where it is formatted to meet the requirements of network C. Account is again taken of this unit of flow between networks B and C. Upon entering network C, the packet is routed to the HOST in which process Y resides and there it is delivered to its ultimate destination.
由于网关必须了解源主机和目标主机的地址,因此到达网关的每个数据包中必须以标准格式提供此信息。该信息包含在源HOST为数据包添加前缀的互联网络标头中。数据包格式,包括网络报头,如图 38.3所示。源条目和目标条目统一且唯一地标识复合网络中每个主机的地址。寻址是一个相当复杂的主题,下一节将更详细地讨论。标头中接下来的两个条目提供序列号和字节计数,可用于在传送到目的地时对数据包进行正确排序,并且还可以使网关能够检测影响数据包的故障情况。标志字段用于传达特定的控制信息,将在稍后的重传和重复检测部分中讨论。数据包的其余部分包含以下文本:传送到目的地以及用于端到端软件验证的尾随校验和。网关不修改文本,仅转发校验和,而不计算或重新计算它。
Since the GATEWAY must understand the address of the source and destination HOSTS, this information must be available in a standard format in every packet which arrives at the GATEWAY. This information is contained in an internetwork header prefixed to the packet by the source HOST. The packet format, including the internetwork header, is illustrated in Figure 38.3. The source and destination entries uniformly and uniquely identify the address of every HOST in the composite network. Addressing is a subject of considerable complexity which is discussed in greater detail in the next section. The next two entries in the header provide a sequence number and a byte count that may be used to properly sequence the packets upon delivery to the destination and may also enable the GATEWAYS to detect fault conditions affecting the packet. The flag field is used to convey specific control information and is discussed in the section on retransmission and duplicate detection later. The remainder of the packet consists of text for delivery to the destination and a trailing check sum used for end-to-end software verification. The GATEWAY does not modify the text and merely forwards the check sum along without computing or recomputing it.
图 38.3: 互联网数据包格式(字段未按比例显示)
Figure 38.3: Internetwork packet format (fields not shown to scale)
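The field order described above can be sketched in Python. This is only an illustration: the class names, field names, and the `gateway_relay` helper are assumptions made for the example, no field widths are implied (the figure is not to scale), and the check sum algorithm is left unspecified, as it is in the text. The sketch shows the one behavior the text does state, namely that a GATEWAY changes the local wrapping but forwards the text and check sum untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InternetworkPacket:
    # Internetwork header, in the order the text lists the entries.
    source: int            # address of the source HOST
    destination: int       # address of the destination HOST
    sequence_number: int
    byte_count: int        # number of bytes of text
    flags: int
    # Remainder of the packet.
    text: bytes
    check_sum: int         # end-to-end value; the algorithm is not specified here


@dataclass(frozen=True)
class LocalFrame:
    # Hypothetical wrapper: an internetwork packet embedded in one
    # network's local format. The local header differs per network.
    local_header: bytes
    packet: InternetworkPacket


def gateway_relay(frame: LocalFrame, next_local_header: bytes) -> LocalFrame:
    """Extract the internetwork packet and embed it for the next network.

    The packet itself, including text and check sum, is passed on unchanged.
    """
    return LocalFrame(local_header=next_local_header, packet=frame.packet)
```

Because the check sum is carried rather than recomputed, only the two HOSTS at the ends need to agree on how it is calculated.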
每个网络可能需要增强数据包格式,然后才能通过各个网络。我们在图中指出了一个本地标头,它位于数据包的开头。引入该本地报头仅仅是为了说明以数据包必须通过的单独网络的格式嵌入互联网数据包的概念。显然,它的具体形式因网络而异,在某些情况下甚至可能是不必要的。尽管图中未明确指示,但也可能将本地尾部附加到数据包的末尾。除非立法上限制所有传输的数据包足够小以被每个单独的网络接受,否则网关可能被迫将数据包分割成两个或更多个较小的数据包。此操作称为分段,并且必须以目的地能够将分段数据包拼凑在一起的方式完成。很明显,互联网络标头格式规定了所有网络必须承载的最小数据包大小(显然所有网络都希望承载大于此最小值的数据包)。我们认为,由于以下原因,指定数据包大小可以比最小数据包大多少将严重抑制互联网通信的长期增长和发展。
Each network may need to augment the packet format before it can pass through the individual network. We have indicated a local header in the figure which is prefixed to the beginning of the packet. This local header is introduced merely to illustrate the concept of embedding an internetwork packet in the format of the individual network through which the packet must pass. It will obviously vary in its exact form from network to network and may even be unnecessary in some cases. Although not explicitly indicated in the figure, it is also possible that a local trailer may be appended to the end of the packet. Unless all transmitted packets are legislatively restricted to be small enough to be accepted by every individual network, the GATEWAY may be forced to split a packet into two or more smaller packets. This action is called fragmentation and must be done in such a way that the destination is able to piece together the fragmented packet. It is clear that the internetwork header format imposes a minimum packet size which all networks must carry (obviously all networks will want to carry packets larger than this minimum). We believe the long range growth and development of internetwork communication would be seriously inhibited by specifying how much larger than the minimum a packet size can be, for the following reasons.
1. 如果指定了最大允许数据包大小,则不可能将一个网络的内部数据包大小参数与所有其他网络的内部数据包大小参数完全隔离。
1. If a maximum permitted packet size is specified then it becomes impossible to completely isolate the internal packet size parameters of one network from the internal packet size parameters of all other networks.
2. 响应新技术(例如大型存储器系统、更高数据速率的通信设施等)而增加最大允许数据包大小将是非常困难的,因为这需要所有参与网络达成一致并随后实施。
2. It would be very difficult to increase the maximum permitted packet size in response to new technology (e.g. large memory systems, higher data rate communication facilities, etc.) since this would require the agreement and then implementation by all participating networks.
3. 关联寻址和分组加密可能需要在传输过程中扩展特定分组的大小以合并新信息。
3. Associative addressing and packet encryption may require the size of a particular packet to expand during transit for incorporation of new information.
分段的规定(无论在何处执行)允许在单个网络的基础上处理数据包大小变化,而无需全局管理,并且还允许主机和进程免受任何网络中允许的数据包大小的变化的影响,通过这些网络,它们的数据必须通过经过。
Provision for fragmentation (regardless of where it is performed) permits packet size variations to be handled on an individual network basis without global administration and also permits HOSTS and processes to be insulated from changes in the packet sizes permitted in any networks through which their data must pass.
如果必须进行分段,则最好在进入网关处的下一个网络时进行分段,因为只有此网关(而不是其他网络)必须知道需要进行分段的内部数据包大小参数。
If fragmentation must be done, it appears best to do it upon entering the next network at the GATEWAY since only this GATEWAY (and not the other networks) must be aware of the internal packet size parameters which made the fragmentation necessary.
如果网关将传入数据包分段为两个或多个数据包,则它们最终必须作为分段传递到目标主机或为主机重新组装。可以想象,人们可能希望网关执行重组以简化目标主机(或进程)的任务和/或利用较大的数据包大小。我们认为网关不应执行此功能,因为网关重组可能导致严重的缓冲问题、潜在的死锁、数据包的所有片段的必要性通过同一个GATEWAY,增加了传输时延。此外, GATEWAY提供此功能是不够的,因为最终的GATEWAY还可能必须对数据包进行分段以进行传输。因此,目标主机必须准备好执行此任务。
If a GATEWAY fragments an incoming packet into two or more packets, they must eventually be passed along to the destination HOST as fragments or reassembled for the HOST. It is conceivable that one might desire the GATEWAY to perform the reassembly to simplify the task of the destination HOST (or process) and/or to take advantage of the larger packet size. We take the position that GATEWAYS should not perform this function since GATEWAY reassembly can lead to serious buffering problems, potential deadlocks, the necessity for all fragments of a packet to pass through the same GATEWAY, and increased delay in transmission. Furthermore, it is not sufficient for the GATEWAY to provide this function since the final GATEWAY may also have to fragment a packet for transmission. Thus the destination HOST must be prepared to do this task.
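A rough Python sketch of this division of labor follows. It assumes, in line with the sequencing scheme the paper proposes, that a packet's sequence number is the byte position of its first text byte, so every fragment can be numbered independently. The function names and the tuple representation are assumptions for illustration, and wraparound of a finite sequence number space is ignored.

```python
from typing import List, Tuple

# (sequence number of the first text byte, text)
Fragment = Tuple[int, bytes]


def fragment(packet: Fragment, max_text: int) -> List[Fragment]:
    """Split a packet so that no piece carries more than max_text bytes."""
    if max_text <= 0:
        raise ValueError("max_text must be positive")
    seq, text = packet
    if len(text) <= max_text:
        return [packet]
    return [(seq + i, text[i:i + max_text])
            for i in range(0, len(text), max_text)]


def reassemble(fragments: List[Fragment]) -> Fragment:
    """Destination HOST side: rebuild the text from fragments in any order."""
    ordered = sorted(fragments)
    first_seq = ordered[0][0]
    text = b""
    for seq, piece in ordered:
        if seq != first_seq + len(text):
            raise ValueError("missing or overlapping fragment")
        text += piece
    return first_seq, text
```

Since each fragment is itself a well-formed (sequence number, text) pair, a later GATEWAY can fragment it again without knowing it was ever part of a larger packet, and only the destination needs buffers for reassembly.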
现在让我们简单地讨论一下当数据包可能被一个或多个GATEWAY分段时出现的有些不寻常的记账效应。为简单起见,我们假设每个网络最初对每个传输的数据包收取固定费率,无论距离如何,如果一个网络可以处理比另一个网络更大的数据包,则它对每个数据包收取的价格成比例更高。我们还假设任何网络数据包大小的后续增加不会导致用户为每个数据包增加额外成本。因此,通过任何必须对数据包进行分段的网络,用户的费用基本保持不变。当数据包被分割成更小的数据包时,就会出现这种不寻常的效果,这些数据包必须单独通过数据包大小比原始未分段数据包更大的后续网络。我们预计大多数网络自然会选择彼此接近的数据包大小,但无论如何,一个网络中数据包大小的增加,即使会导致碎片,也不会增加传输成本,实际上可能会降低传输成本。如果采用任何其他数据包计费策略(而不是我们建议的策略),则成本差异可以用作优化单个网络性能的经济杠杆。
Let us now turn briefly to the somewhat unusual accounting effect which arises when a packet may be fragmented by one or more GATEWAYS. We assume, for simplicity, that each network initially charges a fixed rate per packet transmitted, regardless of distance, and if one network can handle a larger packet size than another, it charges a proportionally larger price per packet. We also assume that a subsequent increase in any network’s packet size does not result in additional cost per packet to its users. The charge to a user thus remains basically constant through any net which must fragment a packet. The unusual effect occurs when a packet is fragmented into smaller packets which must individually pass through a subsequent network with a larger packet size than the original unfragmented packet. We expect that most networks will naturally select packet sizes close to one another, but in any case, an increase in packet size in one net, even when it causes fragmentation, will not increase the cost of transmission and may actually decrease it. In the event that any other packet charging policies (than the one we suggest) are adopted, differences in cost can be used as an economic lever toward optimization of individual network performance.
我们假设进程希望使用无限但有限长度的消息与其通信方进行全双工通信。单个字符可能构成从进程到终端的消息文本,反之亦然。一整页字符可能构成从文件到进程的消息文本。数据流(例如,连续生成的位串)可以表示为有限长度消息的序列。
We suppose that processes wish to communicate in full duplex with their correspondents using unbounded but finite length messages. A single character might constitute the text of a message from a process to a terminal or vice versa. An entire page of characters might constitute the text of a message from a file to a process. A data stream (e.g. a continuously generated bit string) can be represented as a sequence of finite length messages.
在主机内,我们假设存在一个传输控制程序(TCP),它代表它所服务的进程处理消息的传输和接受。TCP 又由连接到TCP 所在主机的一个或多个分组交换机提供服务。想要与 TCP 进行通信的进程将消息呈现给 TCP 进行传输,而 TCP 将传入消息传送到适当的目标进程。我们允许 TCP 将消息分成段,因为目的地可能会限制可能到达的数据量,因为本地网络可能会限制最大传输大小,或者因为 TCP 可能需要在许多进程之间同时共享其资源。此外,我们将段的长度限制为整数个 8 位字节。这种一致性对于简化不同自然字长的主机所需的软件非常有帮助。可以在进程级别进行规定,以填充不是整数字节的消息,并识别到达的文本字节中的哪些包含接收进程感兴趣的信息。
Within a HOST we assume the existence of a transmission control program (TCP) which handles the transmission and acceptance of messages on behalf of the processes it serves. The TCP is in turn served by one or more packet switches connected to the HOST in which the TCP resides. Processes that want to communicate present messages to the TCP for transmission, and TCP’s deliver incoming messages to the appropriate destination processes. We allow the TCP to break up messages into segments because the destination may restrict the amount of data that may arrive, because the local network may limit the maximum transmission size, or because the TCP may need to share its resources among many processes concurrently. Furthermore, we constrain the length of a segment to an integral number of 8-bit bytes. This uniformity is most helpful in simplifying the software needed with HOST machines of different natural word lengths. Provision at the process level can be made for padding a message that is not an integral number of bytes and for identifying which of the arriving bytes of text contain information of interest to the receiving process.
进程间段的复用和解复用是 TCP 的基本任务。在传输时,TCP 必须将来自不同源进程的段复用在一起并生成互联网数据包以传送到其服务数据包交换机之一。接收时,TCP 将从其服务数据包交换机接受一系列数据包。根据此到达数据包序列(通常来自不同的主机),TCP 必须能够重建消息并将消息传递到正确的目标进程。
Multiplexing and demultiplexing of segments among processes are fundamental tasks of the TCP. On transmission, a TCP must multiplex together segments from different source processes and produce internetwork packets for delivery to one of its serving packet switches. On reception, a TCP will accept a sequence of packets from its serving packet switch(es). From this sequence of arriving packets (generally from different HOSTS), the TCP must be able to reconstruct and deliver messages to the proper destination processes.
我们假设每个段都增加了额外的信息,允许传输和接收 TCP 分别识别目标和源进程。此时,我们必须面对一个重大问题。源 TCP 应该如何格式化发往同一目标 TCP 的分段?我们考虑两种情况。
We assume that every segment is augmented with additional information that allows transmitting and receiving TCP’s to identify destination and source processes, respectively. At this point, we must face a major issue. How should the source TCP format segments destined for the same destination TCP? We consider two cases.
情况 1)如果我们认为段边界并不重要,并且字节流可以由发往同一 TCP 的段组成,那么我们可以通过将流任意打包为数据包(允许多个段)来提高传输效率和资源共享共享单个互联网数据包标头。然而,这个位置导致需要准确地、按顺序地重建由源 TCP 生成的文本字节流。在目的地,必须首先将该流解析为段,然后必须使用这些段来重建消息以传递到适当的进程。由于数据包可能无序到达目的地,因此存在与此策略相关的基本问题。最关键的问题似乎是共享相同 TCP-TCP 字节流的进程之间可能造成的干扰量。在接收端尤其如此。首先,TCP 可能会遇到一些麻烦,将流解析回段,然后将它们分发到重新组装消息的缓冲区。如果不是很明显所有数据段都已到达(请记住,它可能以多个数据包的形式出现),则接收 TCP 可能必须暂时暂停解析,直到更多数据包到达为止。其次,如果数据包丢失,则可能不清楚后续的数据段(即使它们是可识别的)是否可以传递到接收进程,除非 TCP 了解某些进程级排序方案。这些信息将允许 TCP 决定是否可以将后续数据段传递给其等待进程。当字节流中存在间隙时找到段的开头也可能很困难。
Case 1) If we take the position that segment boundaries are immaterial and that a byte stream can be formed of segments destined for the same TCP, then we may gain improved transmission efficiency and resource sharing by arbitrarily parceling the stream into packets, permitting many segments to share a single internetwork packet header. However, this position results in the need to reconstruct exactly, and in order, the stream of text bytes produced by the source TCP. At the destination, this stream must first be parsed into segments and these in turn must be used to reconstruct messages for delivery to the appropriate processes. There are fundamental problems associated with this strategy due to the possible arrival of packets out of order at the destination. The most critical problem appears to be the amount of interference that processes sharing the same TCP–TCP byte stream may cause among themselves. This is especially so at the receiving end. First, the TCP may be put to some trouble to parse the stream back into segments and then distribute them to buffers where messages are reassembled. If it is not readily apparent that all of a segment has arrived (remember, it may come as several packets), the receiving TCP may have to suspend parsing temporarily until more packets have arrived. Second, if a packet is missing, it may not be clear whether succeeding segments, even if they are identifiable, can be passed on to the receiving process, unless the TCP has knowledge of some process level sequencing scheme. Such knowledge would permit the TCP to decide whether a succeeding segment could be delivered to its waiting process. Finding the beginning of a segment when there are gaps in the byte stream may also be hard.
情况 2)或者,我们可能会采取这样的立场:目标 TCP 在到达时应该能够在没有附加信息的情况下确定接收到的数据包是用于哪个或哪些进程,如果是,则是否应该将其传送。
Case 2) Alternatively, we might take the position that the destination TCP should be able to determine, upon its arrival and without additional information, for which process or processes a received packet is intended, and if so, whether it should be delivered then.
如果 TCP 要确定到达的数据包要发送给哪个进程,则每个数据包必须包含一个完全标识目标进程的进程标头(与互连网络标头不同)。为简单起见,我们假设每个数据包包含来自单个进程的文本,该文本的目的地是单个进程。因此,每个数据包只需要包含一个进程标头。为了决定到达的数据是否可以传送到目标进程,TCP 必须能够确定数据是否处于正确的顺序(我们可以为目标进程提供规定,指示其 TCP 忽略排序,但这被认为是一个错误)特例)。假设每个到达的数据包都包含进程标头,则必要的排序和目标进程标识可立即供目标 TCP 使用。
If the TCP is to determine for which process an arriving packet is intended, every packet must contain a process header (distinct from the internetwork header) that completely identifies the destination process. For simplicity, we assume that each packet contains text from a single process which is destined for a single process. Thus each packet need contain only one process header. To decide whether the arriving data is deliverable to the destination process, the TCP must be able to determine whether the data is in the proper sequence (we can make provision for the destination process to instruct its TCP to ignore sequencing, but this is considered a special case). With the assumption that each arriving packet contains a process header, the necessary sequencing and destination process identification is immediately available to the destination TCP.
情况 1) 和 2) 都提供了将段解复用和传送到目标进程的功能,但只有情况 2) 这样做时不会引入潜在的进程间干扰。此外,情况 1) 引入了额外的机制来处理HOST到HOST基础上的流量控制,因为还必须有一些流程级别控制的规定,并且这种机制很少使用,因为在给定的HOST内的概率很小,两个进程会同时安排向同一目标HOST发送消息。为此,我们选择案例2)的方法作为网际传输协议的一部分。
Both Cases 1) and 2) provide for the demultiplexing and delivery of segments to destination processes, but only Case 2) does so without the introduction of potential interprocess interference. Furthermore, Case 1) introduces extra machinery to handle flow control on a HOST-to-HOST basis, since there must also be some provision for process level control, and this machinery is little used since the probability is small that within a given HOST, two processes will be coincidentally scheduled to send messages to the same destination HOST. For this reason, we select the method of Case 2) as a part of the internetwork transmission protocol.
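The appeal of Case 2) can be seen in a small Python sketch. Because every packet carries its own process header, the receiving TCP can sort arrivals by destination port without parsing a shared byte stream. The `Packet` type and `demultiplex` function are hypothetical names, and the sketch assumes for simplicity that each destination port has a single correspondent.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Packet(NamedTuple):
    # Process header fields plus the sequence number from the internetwork header.
    source_port: int
    destination_port: int
    sequence_number: int
    text: bytes


def demultiplex(packets: Iterable[Packet]) -> Dict[int, List[Tuple[int, bytes]]]:
    """Group arriving packets by destination port, ordered by sequence number."""
    per_port: Dict[int, List[Tuple[int, bytes]]] = defaultdict(list)
    for packet in packets:
        per_port[packet.destination_port].append(
            (packet.sequence_number, packet.text)
        )
    return {port: sorted(items) for port, items in per_port.items()}
```

A packet lost on its way to one port leaves the other ports' queues unaffected, which is the absence of interprocess interference the text refers to.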
地址格式的选择是网络之间的一个问题,因为 TCP 的本地网络地址在格式和大小上可能有很大差异。每个网关和 TCP 都能理解的统一的互联网络 TCP 地址空间对于互联网络数据包的路由和传送至关重要。当我们处理进程寻址以及更一般的端口寻址时,也会遇到类似的问题。我们引入端口的概念是为了允许进程区分多个消息流。端口只是与进程关联的一个此类消息流的指示符。不同操作系统中识别端口的方式通常不同,因此,为了获得统一的寻址,还需要标准的端口地址格式。端口地址指定全双工消息流。
The selection of address formats is a problem between networks because the local network addresses of TCP’s may vary substantially in format and size. A uniform internetwork TCP address space, understood by each GATEWAY and TCP, is essential to routing and delivery of internetwork packets. Similar troubles are encountered when we deal with process addressing and, more generally, port addressing. We introduce the notion of ports in order to permit a process to distinguish between multiple message streams. The port is simply a designator of one such message stream associated with a process. The means for identifying a port are generally different in different operating systems, and therefore, to obtain uniform addressing, a standard port address format is also required. A port address designates a full duplex message stream.
TCP 寻址与路由问题密切相关,因为主机或网关必须为传出的互联网络数据包选择合适的目标主机或网关。让我们假设 TCP 地址采用以下地址格式(图 38.4)。网络标识(8 位)的选择允许最多 256 个不同的网络。这个尺寸对于可预见的未来来说似乎足够了。类似地,TCP 标识符字段允许寻址最多 65,536 个不同的 TCP,这对于任何给定的网络来说似乎都绰绰有余。
TCP addressing is intimately bound up in routing issues, since a HOST or GATEWAY must choose a suitable destination HOST or GATEWAY for an outgoing internetwork packet. Let us postulate the following address format for the TCP address (Figure 38.4). The choice for network identification (8 bits) allows up to 256 distinct networks. This size seems sufficient for the foreseeable future. Similarly, the TCP identifier field permits up to 65,536 distinct TCP’s to be addressed, which seems more than sufficient for any given network.
图 38.4: TCP 地址
Figure 38.4: TCP address
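As a concrete illustration of the postulated format, the Python sketch below packs an 8-bit network identifier and a 16-bit TCP identifier into one 24-bit value. It assumes the network identifier occupies the upper 8 bits, which is the layout the routing description relies on; the function names are made up for the example.

```python
NETWORK_BITS = 8    # allows 2**8 = 256 networks
TCP_ID_BITS = 16    # allows 2**16 = 65,536 TCP's per network


def pack_tcp_address(network: int, tcp_id: int) -> int:
    """Combine network and TCP identifiers into a 24-bit TCP address."""
    if not 0 <= network < (1 << NETWORK_BITS):
        raise ValueError("network identifier does not fit in 8 bits")
    if not 0 <= tcp_id < (1 << TCP_ID_BITS):
        raise ValueError("TCP identifier does not fit in 16 bits")
    return (network << TCP_ID_BITS) | tcp_id


def unpack_tcp_address(address: int) -> tuple:
    """Split a 24-bit TCP address into (network, tcp_id)."""
    return address >> TCP_ID_BITS, address & ((1 << TCP_ID_BITS) - 1)
```

For example, network 3 and TCP identifier 0x1234 pack to 0x031234, and unpacking that value returns the original pair.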
当每个数据包通过GATEWAY时,GATEWAY会观察目标网络 ID 以确定如何路由数据包。如果目标网络连接到GATEWAY,则 TCP 地址的低 16 位用于生成目标网络中的本地 TCP 地址。如果目标网络没有连接到GATEWAY,则高8 位用于选择后续的GATEWAY。我们没有努力指定每个单独的网络如何将互联网络 TCP 标识符与其本地 TCP 地址相关联。我们也不排除本地网络理解互联网络寻址方案并从而减轻网关的路由责任的可能性。
As each packet passes through a GATEWAY, the GATEWAY observes the destination network ID to determine how to route the packet. If the destination network is connected to the GATEWAY, the lower 16 bits of the TCP address are used to produce a local TCP address in the destination network. If the destination network is not connected to the GATEWAY, the upper 8 bits are used to select a subsequent GATEWAY. We make no effort to specify how each individual network shall associate the internetwork TCP identifier with its local TCP address. We also do not rule out the possibility that the local network understands the internetwork addressing scheme and thus relieves the GATEWAY of the routing responsibility.
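The GATEWAY's decision can be sketched in a few lines of Python. The function name, the tuple results, and the routing table are assumptions for illustration, and the network numbers in the usage note are arbitrary. The step that turns the 16-bit identifier into a local address is deliberately left out, since the text does not specify it.

```python
def route(destination_address: int, attached_networks: set, next_gateway: dict):
    """Decide what a GATEWAY does with a packet bound for destination_address.

    destination_address: 24-bit TCP address (8-bit network, 16-bit TCP id).
    attached_networks:   network ids this GATEWAY is directly connected to.
    next_gateway:        hypothetical table mapping a remote network id to
                         the name of the next GATEWAY toward it.
    """
    network = destination_address >> 16        # upper 8 bits
    tcp_id = destination_address & 0xFFFF      # lower 16 bits
    if network in attached_networks:
        # The local network maps tcp_id to its own address format.
        return ("deliver", network, tcp_id)
    # Raises KeyError if no route to the network is known.
    return ("forward", next_gateway[network])
```

Numbering networks A, B, and C as 1, 2, and 3, GATEWAY M would call `route(address, {1, 2}, {3: "N"})`: a packet for network C is forwarded to N, while a packet for network B is delivered locally.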
接收 TCP 面临着解复用其接收的网络数据包流并为每个目标进程重建原始消息的任务。每个操作系统都有自己的识别进程和端口的内部方法。我们假设 16 位足以用作互联网络端口标识符。发送进程不需要知道如何使用目的地端口标识。目标 TCP 将能够适当地解析该数字,以找到将到达的数据包放入其中的正确缓冲区。我们允许使用大端口号字段来支持想要同时区分许多不同消息流的进程。实际上,我们并不关心 16 位是如何被所涉及的 TCP 分割的。
A receiving TCP is faced with the task of demultiplexing the stream of internetwork packets it receives and reconstructing the original messages for each destination process. Each operating system has its own internal means of identifying processes and ports. We assume that 16 bits are sufficient to serve as internetwork port identifiers. A sending process need not know how the destination port identification will be used. The destination TCP will be able to parse this number appropriately to find the proper buffer into which it will place arriving packets. We permit a large port number field to support processes which want to distinguish between many different message streams concurrently. In reality, we do not care how the 16 bits are sliced up by the TCP’s involved.
尽管传输的端口名称字段很大,但它仍然是端口内部表示的紧凑外部名称。通常需要使用端口标识符的短名称来减少传输开销并可能减少目标 TCP 处的数据包处理时间。然而,为每个端口分配短名称需要源和目的地之间进行初始协商以就合适的短名称分配达成一致,随后在源和目的地处维护转换表,以及释放短名称的最终事务。对于端口名称的动态分配,这种协商在任何情况下通常都是必要的。
Even though the transmitted port name field is large, it is still a compact external name for the internal representation of the port. The use of short names for port identifiers is often desirable to reduce transmission overhead and possibly reduce packet processing time at the destination TCP. Assigning short names to each port, however, requires an initial negotiation between source and destination to agree on a suitable short name assignment, the subsequent maintenance of conversion tables at both the source and the destination, and a final transaction to release the short name. For dynamic assignment of port names, this negotiation is generally necessary in any case.
如图38.5所示,消息被 TCP 分成段,其格式在图 38.6中更详细地显示。所示的字段长度仅是建议性的。前两个字段(图中的源端口和目标端口)已在前面的寻址部分中讨论过。第三和第四字段的使用(窗口和确认)该图)将在稍后的重传和重复检测部分中讨论。我们从图 38.3中回想起,网络标头包含序列号和字节计数,以及标志字段和校验和。这些字段的用途将在下一节中解释。
As shown in Figure 38.5, messages are broken by the TCP into segments whose format is shown in more detail in Figure 38.6. The field lengths illustrated are merely suggestive. The first two fields (source port and destination port in the figure) have already been discussed in the preceding section on addressing. The uses of the third and fourth fields (window and acknowledgement in the figure) will be discussed later in the section on retransmission and duplicate detection. We recall from Figure 38.3 that an internetwork header contains both a sequence number and a byte count, as well as a flag field and a check sum. The uses of these fields are explained in the following section.
Figure 38.5: Creation of segments and packets from messages
Figure 38.6: Segment format (process header and text)
The reconstruction of a message at the receiving TCP clearly requires that each internetwork packet carry a sequence number which is unique to its particular destination port message stream. The sequence numbers must be monotonic increasing (or decreasing) since they are used to reorder and reassemble arriving packets into a message. If the space of sequence numbers were infinite, we could simply assign the next one to each new packet. Clearly, this space cannot be infinite, and we will consider what problems a finite sequence number space will cause when we discuss retransmission and duplicate detection in the next section. We propose the following scheme for performing the sequencing of packets and hence the reconstruction of messages by the destination TCP.
A pair of ports will exchange one or more messages over a period of time. We could view the sequence of messages produced by one port as if it were embedded in an infinitely long stream of bytes. Each byte of the message has a unique sequence number which we take to be its byte location relative to the beginning of the stream. When a segment is extracted from the message by the source TCP and formatted for internetwork transmission, the relative location of the first byte of segment text is used as the sequence number for the packet. The byte count field in the internetwork header accounts for all the text in the segment (but does not include the check-sum bytes or the bytes in either internetwork or process header). We emphasize that the sequence number associated with a given packet is unique only to the pair of ports that are communicating (see Figure 38.7). Arriving packets are examined to determine for which port they are intended. The sequence numbers on each arriving packet are then used to determine the relative location of the packet text in the messages under reconstruction. We note that this allows the exact position of the data in the reconstructed message to be determined even when pieces are still missing.
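The byte-position numbering described above can be illustrated with a minimal Python sketch. The names `Packet`, `segment`, and `place`, the starting sequence number 1000, and the 8-byte segment size are all assumptions made for illustration; none of them comes from the paper. The sketch shows that each packet can be copied to its exact position in the stream regardless of arrival order.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int     # stream position of the first text byte (hypothetical field name)
    text: bytes  # segment text; len(text) plays the role of the byte count


def segment(message: bytes, start_seq: int, max_text: int) -> list[Packet]:
    """Cut a message into packets numbered by byte position in the stream."""
    return [
        Packet(seq=start_seq + i, text=message[i:i + max_text])
        for i in range(0, len(message), max_text)
    ]


def place(buffer: bytearray, filled: list[bool], base_seq: int, p: Packet) -> None:
    """Copy packet text to its exact position, whatever the arrival order."""
    offset = p.seq - base_seq
    buffer[offset:offset + len(p.text)] = p.text
    for i in range(offset, offset + len(p.text)):
        filled[i] = True


message = b"internetwork packet communication"
packets = segment(message, start_seq=1000, max_text=8)

buffer = bytearray(len(message))
filled = [False] * len(message)
for p in reversed(packets):  # simulate out-of-order arrival
    place(buffer, filled, 1000, p)
reassembled = bytes(buffer)
```

The `filled` list records which byte positions have arrived, so missing pieces remain identifiable while later text is already in place.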
Figure 38.7: Assignment of sequence numbers
Every segment produced by a source TCP is packaged in a single internetwork packet and a check sum is computed over the text and process header associated with the segment. The splitting of messages into segments by the TCP and the potential splitting of segments into smaller pieces by GATEWAYS creates the necessity for indicating to the destination TCP when the end of a segment (ES) has arrived and when the end of a message (EM) has arrived. The flag field of the internetwork header is used for this purpose (see Figure 38.8).
Figure 38.8: Internetwork header flag field
The ES flag is set by the source TCP each time it prepares a segment for transmission. If it should happen that the message is completely contained in the segment, then the EM flag would also be set. The EM flag is also set on the last segment of a message, if the message could not be contained in one segment. These two flags are used by the destination TCP, respectively, to discover the presence of a check sum for a given segment and to discover that a complete message has arrived.
The ES and EM flags in the internetwork header are known to the GATEWAY and are of special importance when packets must be split apart for propagation through the next local network. We illustrate their use with an example in Figure 38.9.
Figure 38.9: Message splitting and packet splitting
The original message A in Figure 38.9 is shown split into two segments A1 and A2 and formatted by the TCP into a pair of internetwork packets. Packets A1 and A2 have their ES bits set, and A2 has its EM bit set as well. When packet A1 passes through the GATEWAY, it is split into two pieces: packet A11 for which neither EM nor ES bits are set, and packet A12 whose ES bit is set. Similarly, packet A2 is split such that the first piece, packet A21, has neither bit set, but packet A22 has both bits set. The sequence number field (SEQ) and the byte count field (CT) of each packet are modified by the GATEWAY to properly identify the text bytes of each packet. The GATEWAY need only examine the internetwork header to do fragmentation.
The destination TCP, upon reassembling segment A1, will detect the ES flag and will verify the check sum it knows is contained in packet A12. Upon receipt of packet A22, assuming all other packets have arrived, the destination TCP detects that it has reassembled a complete message and can now advise the destination process of its receipt.
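The flag handling in this example can be sketched in Python. This is an illustration under stated assumptions, not the paper's algorithm: the names `Packet` and `split`, the 100-byte segment texts, and the 50-byte limit of the next network are invented here, and the check sum is not modeled. The sketch only shows how a gateway could carry ES and EM onto the last piece while adjusting SEQ and the byte count.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int     # SEQ: sequence number of the first text byte
    text: bytes  # CT, the byte count, is len(text)
    es: bool     # end-of-segment flag
    em: bool     # end-of-message flag


def split(p: Packet, max_text: int) -> list[Packet]:
    """Split one packet into pieces of at most max_text bytes of text.

    Only the final piece inherits the ES and EM flags of the original.
    """
    pieces = []
    for i in range(0, len(p.text), max_text):
        last = i + max_text >= len(p.text)
        pieces.append(
            Packet(
                seq=p.seq + i,
                text=p.text[i:i + max_text],
                es=p.es and last,
                em=p.em and last,
            )
        )
    return pieces


# Segments A1 and A2 of message A; A2 ends the message.
a1 = Packet(seq=0, text=b"x" * 100, es=True, em=False)
a2 = Packet(seq=100, text=b"y" * 100, es=True, em=True)

a11, a12 = split(a1, 50)
a21, a22 = split(a2, 50)
```

With these inputs, A11 and A21 carry neither flag, A12 carries ES only, and A22 carries both, matching the description of Figure 38.9.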
No transmission can be 100 percent reliable. We propose a timeout and positive acknowledgement mechanism which will allow TCP’s to recover from packet losses from one HOST to another. A TCP transmits packets and waits for replies (acknowledgements) that are carried in the reverse packet stream. If no acknowledgement for a particular packet is received, the TCP will retransmit. It is our expectation that the HOST level retransmission mechanism, which is described in the following paragraphs, will not be called upon very often in practice. Evidence already exists (Pouzin, 1973b) that individual networks can be effectively constructed without this feature. However, the inclusion of a HOST retransmission capability makes it possible to recover from occasional network problems and allows a wide range of HOST protocol strategies to be incorporated. We envision it will occasionally be invoked to allow HOST accommodation to infrequent overdemands for limited buffer resources, and otherwise not used much.
Any retransmission policy requires some means by which the receiver can detect duplicate arrivals. Even if an infinite number of distinct packet sequence numbers were available, the receiver would still have the problem of knowing how long to remember previously received packets in order to detect duplicates. Matters are complicated by the fact that only a finite number of distinct sequence numbers are in fact available, and if they are reused, the receiver must be able to distinguish between new transmissions and retransmissions.
A window strategy, similar to that used by the French CYCLADES system (voie virtuelle transmission mode [Chambon et al., 1973]) and the ARPANET very distant HOST connection (BBN, 1973), is proposed here (see Figure 38.10).
Figure 38.10: The window concept
Suppose that the sequence number field in the internetwork header permits sequence numbers to range from 0 to n − 1. We assume that the sender will not transmit more than w bytes without receiving an acknowledgment. The w bytes serve as the window (see Figure 38.11). Clearly, w must be less than n. The rules for sender and receiver are as follows.
Figure 38.11: Conceptual TCB format [EDITOR: “TCB” = transmit control block.]
Sender: Let L be the sequence number associated with the left window edge.
1. The sender transmits bytes from segments whose text lies between L and L + w − 1.
2. On timeout (duration unspecified), the sender retransmits unacknowledged bytes.
3. On receipt of acknowledgment consisting of the receiver’s current left window edge, the sender’s left window edge is advanced over the acknowledged bytes (advancing the right window edge implicitly).
Receiver:
1. Arriving packets whose sequence numbers coincide with the receiver’s current left window edge are acknowledged by sending to the source the next sequence number expected. This effectively acknowledges bytes in between. The left window edge is advanced to the next sequence number expected.
2. Packets arriving with a sequence number to the left of the window edge (or, in fact, outside of the window) are discarded, and the current left window edge is returned as acknowledgement.
3. Packets whose sequence numbers lie within the receiver’s window but do not coincide with the receiver’s left window edge are optionally kept or discarded, but are not acknowledged. This is the case when packets arrive out of order. …
Every segment that arrives at the destination TCP is ultimately acknowledged by returning the sequence number of the next segment which must be passed to the process (it may not yet have arrived).
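The sender and receiver rules can be sketched in Python under simplifying assumptions. The class and method names are hypothetical, sequence numbers are treated as unbounded (no wraparound), timeouts are omitted, and the receiver takes the "discard" option for packets that arrive inside the window but out of order, sending no acknowledgement for them. That last choice is an assumption of this sketch.

```python
from typing import Optional


class Receiver:
    def __init__(self, left: int, window: int) -> None:
        self.left = left            # left window edge: next sequence number expected
        self.window = window
        self.delivered = bytearray()

    def receive(self, seq: int, text: bytes) -> Optional[int]:
        """Return the acknowledgement to send (the left window edge), or None."""
        if seq == self.left:
            # Receiver rule 1: accept, advance the edge, acknowledge the new edge.
            self.delivered += text
            self.left += len(text)
            return self.left
        if seq < self.left or seq >= self.left + self.window:
            # Receiver rule 2: outside the window; discard and repeat the edge.
            return self.left
        # Receiver rule 3: inside the window but out of order.
        # This sketch discards the packet and sends nothing.
        return None


class Sender:
    def __init__(self, left: int, window: int) -> None:
        self.left = left
        self.window = window

    def may_send(self, seq: int, count: int) -> bool:
        """Sender rule 1: the text must lie between L and L + w - 1."""
        return self.left <= seq and seq + count - 1 <= self.left + self.window - 1

    def on_ack(self, ack: int) -> None:
        """Sender rule 3: advance the left edge over acknowledged bytes."""
        if ack > self.left:
            self.left = ack


receiver = Receiver(left=0, window=16)
sender = Sender(left=0, window=16)

allowed_before = sender.may_send(seq=16, count=8)    # beyond L + w - 1
ack_out_of_order = receiver.receive(8, b"BBBBBBBB")  # inside window, not at the edge
ack_in_order = receiver.receive(0, b"AAAAAAAA")      # at the left edge
ack_duplicate = receiver.receive(0, b"AAAAAAAA")     # retransmission, left of the edge
sender.on_ack(ack_in_order)
allowed_after = sender.may_send(seq=16, count=8)     # window has moved to 8..23
```

Because every acknowledgement is simply the receiver's current left edge, a duplicate arrival and a fresh in-order arrival produce the same kind of reply, and the sender needs no per-packet bookkeeping beyond its own left edge.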
Earlier we described the use of a sequence number space and window to aid in duplicate detection. Acknowledgments are carried in the process header (see Figure 38.6) and along with them there is provision for a “suggested window” which the receiver can use to control the flow of data from the sender. This is intended to be the main component of the process flow control mechanism. The receiver is free to vary the window size according to any algorithm it desires so long as the window size never exceeds half the sequence number space.
This flow control mechanism is exceedingly powerful and flexible and does not suffer from synchronization troubles that may be encountered by incremental buffer allocation schemes (Carr et al., 1970; McKenzie, 1972). However, it relies heavily on an effective retransmission strategy. The receiver can reduce the window even while packets are en route from the sender whose window is presently larger. The net effect of this reduction will be that the receiver may discard incoming packets (they may be outside the window) and reiterate the current window size along with a current window edge as acknowledgment. By the same token, the sender can, upon occasion, choose to send more than a window’s worth of data on the possibility that the receiver will expand the window to accept it (of course, the sender must not send more than half the sequence number space at any time). Normally, we would expect the sender to abide by the window limitation. Expansion of the window by the receiver merely allows more data to be accepted. …
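With a finite sequence number space of size n, window membership has to be tested modulo n. The following Python sketch shows one way to do that; the function name `in_window` and the values n = 16, w = 8, and left edge 12 are assumptions chosen for illustration. The check on w reflects the stated limit of half the sequence number space. A common reading of that limit, offered here as interpretation rather than as the paper's argument, is that it keeps a late retransmission of old bytes from landing on numbers the receiver currently treats as new.

```python
def in_window(seq: int, left: int, w: int, n: int) -> bool:
    """True if seq lies in the w numbers starting at left, modulo n."""
    if not 0 < w <= n // 2:
        raise ValueError("window must be positive and at most half the sequence space")
    return (seq - left) % n < w


n, w, left = 16, 8, 12
# The window wraps around: it covers 12, 13, 14, 15, 0, 1, 2, 3.
accepted = [s for s in range(n) if in_window(s, left, w, n)]
just_left_of_edge = in_window(11, left, w, n)  # an old byte, outside the window
```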
Reprinted from Cerf and Kahn (1974), with permission from the Institute of Electrical and Electronics Engineers.
From the mid-1960s through the 1970s, as the ambition of computer scientists became more extravagant, software projects became larger, bugs became more subtle, and more than once a software system costing millions of dollars didn’t work at all and had to be discarded. A “software crisis” was declared, and initiatives were spawned to manage complexity by constraining programmers’ freedom of expression. Language features were adopted, and others omitted, in order to encourage (or coerce) programmers to hide the complexity of the machine code that implemented their higher-level programming abstractions. These efforts followed two paths, one on control of program flow and the other on the manipulation of data. On the control side, software was to be modularized and control structures limited to a few looping and subroutining primitives (see chapter 29 for Dijkstra’s attack on the “go to” statement, and Dijkstra (1972) for a full articulation of the structured programming philosophy). The other path, of which this paper is an early exemplar, was to adopt language conventions that allow expression only of the minimum necessary functional operations on data, not the internal structure of the data. This sort of “very high level” programming language has now become fairly standard; the object-oriented programming paradigm traces its roots to here.
Barbara Liskov (b. 1939) received her PhD from Stanford under the direction of John McCarthy in 1968; she was one of the first female PhDs in the field. She has spent most of her career since then at MIT, where she is Institute Professor, MIT’s highest faculty rank. She received the Turing Award in 2008, in part for her work on data abstraction, though she has also worked extensively on problems in distributed computing and fault-tolerant computing. Stephen Zilles, a graduate student at MIT when this paper was written, is now retired from a career at IBM and Adobe.
The motivation behind the work in very-high-level languages is to ease the programming task by providing the programmer with a language containing primitives or abstractions suitable to his problem area. The programmer is then able to spend his effort in the right place; he concentrates on solving his problem, and the resulting program will be more reliable as a result. Clearly, this is a worthwhile goal.
Unfortunately, it is very difficult for a designer to select in advance all the abstractions which the users of his language might need. If a language is to be used at all, it is likely to be used to solve problems which its designer did not envision, and for which the abstractions embedded in the language are not sufficient.
This paper presents an approach which allows the set of built-in abstractions to be augmented when the need for a new data abstraction is discovered. This approach to the handling of abstraction is an outgrowth of work on designing a language for structured programming. Relevant aspects of this language are described, and examples of the use and definitions of abstractions are given.
This paper describes an approach to computer representation of abstraction. The approach, developed while designing a language to support structured programming, is also relevant to work in very-high-level languages. We begin by explaining its relevance and by comparing work in structured programming and very-high-level languages.
The purpose of structured programming is to enhance the reliability and understandability of programs. Very-high-level languages, while primarily intended to increase programmer productivity by easing the programmer’s task, can also be expected to enhance the reliability and understandability of code. Thus, similar benefits can be expected from work in the two areas.
Work in the two areas, however, proceeds along different lines. A very-high-level language attempts to present the user with the abstractions (operations, data structures, and control structures) useful to his application area. The user can use these abstractions without being concerned with how they are implemented—he is only concerned with what they do. He is thus able to ignore details not relevant to his application area, and to concentrate on solving his problem.
Structured programming attempts to impose a discipline on the programming task so that the resulting programs are “well-structured.” In this discipline, a problem is solved by means of a process of successive decomposition. The first step is to write a program which solves the problem but which runs on an abstract machine, one which provides just those data objects and operations which are ideally suited to solving the problem. Some or all of those data objects and operations are truly abstract, i.e., not present as primitives in the programming language being used. We will, for the present, group them loosely together under the term “abstraction.”
The programmer is initially concerned with satisfying himself (or proving) that his program correctly solves the problem. In this analysis he is concerned with the way his program makes use of the abstractions, but not with any details of how those abstractions may be realized. When he is satisfied with the correctness of his program, he turns his attention to the abstractions it uses. Each abstraction represents a new problem, requiring additional programs for its solution. The new program may also be written to run on an abstract machine, introducing further abstractions. The original problem is completely solved when all abstractions generated in the course of constructing the program have been realized by further programs.
It is clear now that the approaches of very-high-level languages and structured programming are related to one another: each is based on the idea of making use of those abstractions which are correct for the problem being solved. Furthermore, the rationale for using the abstractions is the same in both approaches: to free the programmer from concern with details not relevant to the problem he is solving.
In very-high-level languages, the designers attempt to identify the set of useful abstractions in advance. A structured programming language, on the other hand, contains no preconceived notions about the particular set of useful abstractions, but, instead, must provide a mechanism whereby the language can be extended to contain the abstractions which the user requires. A language containing such a mechanism can be viewed as a general-purpose, indefinitely-high-level language.
In this paper we describe an approach to abstraction which permits the set of built-in abstractions to be augmented when the need for new abstractions is discovered. We begin by analyzing the abstractions used in writing programs, and identify the need for data abstractions. A language supporting the use and definition of data abstractions is informally described, and some example programs are given. Remaining sections of the paper discuss the relationship of the approach to previous work, and some aspects of the implementation of the language.
The description of structured programming given in the preceding section is vague because it is couched in such undefined terms as “abstraction” and “abstract machine.” In this section we analyze the meaning of “abstraction” to determine what kinds of abstraction a programmer requires, and how a structured programming language can support these requirements.
What we desire from an abstraction is a mechanism which permits the expression of relevant details and the suppression of irrelevant details. In the case of programming, the use which may be made of an abstraction is relevant; the way in which the abstraction is implemented is irrelevant. If we consider conventional programming languages, we discover that they offer a powerful aid to abstraction: the function or procedure. When a programmer makes use of a procedure, he is (or should be) concerned only with what it does—what function it provides for him. He is not concerned with the algorithm executed by the procedure. In addition, procedures provide a means of decomposing a problem—performing part of the programming task inside a procedure, and another part in the program which calls the procedure. Thus, the existence of procedures goes quite far toward capturing the meaning of abstraction.
Unfortunately, procedures alone do not provide a sufficiently rich vocabulary of abstractions. The abstract data objects and control structures of the abstract machine mentioned above are not accurately represented by independent procedures. Because we are considering abstraction in the context of structured programming, we will omit discussion of control abstractions.
This leads us to the concept of abstract data type which is central to the design of the language. An abstract data type defines a class of abstract objects which is completely characterized by the operations available on those objects. This means that an abstract data type can be defined by defining the characterizing operations for that type.
We believe that the above concept captures the fundamental properties of abstract objects. When a programmer makes use of an abstract data object, he is concerned only with the behavior which that object exhibits but not with any details of how that behavior is achieved by means of an implementation. The behavior of an object is captured by the set of characterizing operations. Implementation information, such as how the object is represented in storage, is only needed when defining how the characterizing operations are to be implemented. The user of the object is not required to know or supply this information.
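As a rough modern analogue, the idea of a type characterized entirely by its operations can be sketched as a Python class. This is not the paper's language, and the operation names `push`, `pop`, and `empty` are assumptions chosen for illustration. Note one difference: Python hides the representation only by naming convention, whereas the text describes a language that enforces the separation.

```python
class Stack:
    """A stack known to its users only through push, pop, and empty."""

    def __init__(self) -> None:
        # The representation: a Python list. Users are not meant to touch it,
        # and it could be replaced (for example by a linked list) without
        # changing any code that uses only the operations below.
        self._items = []

    def push(self, item) -> None:
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def empty(self) -> bool:
        return not self._items


s = Stack()
s.push("a")
s.push("b")
top = s.pop()  # the most recently pushed item
```

Code that uses `s` depends only on the behavior of the three operations, which is the sense in which the operations characterize the type.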
Abstract types are intended to be very much like the built-in types provided by a programming language. The user of a built-in type, such as integer or integer array, is only concerned with creating objects of that type and then performing operations on them. He is not (usually) concerned with how the data objects are represented, and he views the operations on the objects as indivisible and atomic when in fact several machine instructions may be required to perform them. In addition, he is not (in general) permitted to decompose the objects. Consider, for example, the built-in type integer. A programmer wants to declare objects of type integer and to perform the usual arithmetic operations on them. He is usually not interested in an integer object as a bit string, and cannot make use of the format of the bits within a computer word. Also, he would like the language to protect him from foolish misuses of types (e.g., adding an integer to a character) either by treating such a thing as an error (strong typing), or by some sort of automatic type conversion.
In the case of a built-in data type, the programmer is making use of a concept or abstraction which is realized at a lower level of detail: the programming language itself and its compiler. Similarly, an abstract data type is used at one level and realized at a lower level, but the lower level does not come into existence automatically by being part of the language. Instead, an abstract data type is realized by writing a special kind of program, called an operation cluster, or cluster for short, which defines the type in terms of the operations which can be performed on it. The language facilitates this activity by allowing the use of an abstract data type without requiring its on-the-spot definition. The language processor supports abstract data types by building links between the use of a type and its definition (which may be provided either earlier or later), and by enforcing the view of a data type as equivalent to a set of operations by a very strong form of data typing.
We observe that a consequence of the concept of abstract data types is that most of the abstract operations in a program will belong to the sets of operations characterizing abstract types. We will use the term functional abstraction to denote those abstract operations which do not belong to any characterizing set. A functional abstraction will be implemented as a composition of the characterizing operations of one or more data types, and will be supported in the usual way by a procedure. A sine routine might be an example of such a functional abstraction. The implementation of the sine routine could be a Taylor series expansion expressed in terms of characterizing operations of the type real.
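The sine example can be made concrete with a short Python sketch. The function name `sine` and the default of 10 terms are assumptions. The routine uses only arithmetic operations on floating-point values, which stand in here for the characterizing operations of the type real, and it sums the series x − x³/3! + x⁵/5! − ….

```python
import math


def sine(x: float, terms: int = 10) -> float:
    """Approximate sin(x) by the first `terms` terms of its Taylor series."""
    term = x   # current term, starting with x itself
    total = x
    for k in range(1, terms):
        # Each term is the previous one times -x^2 / ((2k)(2k + 1)).
        term = -term * x * x / ((2 * k) * (2 * k + 1))
        total = total + term
    return total


# Compare against the library routine for a small argument.
error = abs(sine(0.5) - math.sin(0.5))
```

A caller of `sine` needs to know only what it computes, not that a series expansion is used, which is what makes it a functional abstraction in the sense described.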
We now give an informal description of a programming language which permits the use and definition of abstract data types. This language is a simplified version of a structured programming language that is under development at M.I.T. It is derived primarily from PASCAL (Wirth, 1971) and is conventional in many respects, but it differs from conventional languages in several important ways.
The language provides two forms of modules corresponding to the two forms of abstraction: procedures, which support functional abstractions, and operation clusters, which support abstract data types. Each module is translated (compiled) by itself.
The language has no free variables in the conventional sense. Within a module, the only names that are free, and therefore are defined externally, are the names of other modules; that is, cluster names and procedure names. These names are bound at translation time by means of a directory of module names created by the programmer expressly for this purpose. No names remain to be bound in the translated module.
The language has only structured control. There are no goto’s or labels, but merely variants of concatenation, selection (if, case) and iteration (while) constructions. A structured error-handling mechanism is under development. In this paper, it is represented only by the presence of the reserved word error.
The way in which the language permits the use and definition of abstract data types can best be illustrated by an example. We have chosen the following problem: Write a program, Polish_gen, which will translate from an infix language to a Polish post-fix language. Polish_gen is to be a general-purpose program which makes no assumptions about input or output devices (or files). It makes only the following assumptions about the input language:
1. The input language has an operator precedence grammar.
2. A symbol of the input language is either an arbitrary string of letters and numbers, or a single, non-alphanumeric character; blanks terminate symbols but are otherwise ignored.
For example, if Polish_gen received the string a + b * (c + d) as input, it would produce the string a b c d + * + as output. We have chosen this problem as our example because the problem and its solution are familiar to people interested in programming languages, and the problem is sufficiently complex to illustrate the use of many abstractions.
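To make the translation concrete, here is a minimal Python sketch of an infix-to-postfix converter. It is not the paper's Polish_gen: it reads and returns plain strings rather than abstract files, and a fixed precedence table (`PRECEDENCE`, an assumption) stands in for the grammar. The helper `symbols` follows the second assumption about the input language. Operators are treated as left-associative, and unbalanced parentheses are not handled.

```python
# Assumed precedence table; it stands in for the operator precedence grammar.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def symbols(text: str) -> list[str]:
    """Split input into alphanumeric runs and single other characters."""
    out, current = [], ""
    for ch in text:
        if ch.isalnum():
            current += ch
            continue
        if current:               # a non-alphanumeric character ends the run
            out.append(current)
            current = ""
        if not ch.isspace():      # blanks terminate symbols but are ignored
            out.append(ch)
    if current:
        out.append(current)
    return out


def polish_gen(text: str) -> str:
    """Translate an infix expression to Polish postfix form."""
    output, stack = [], []
    for sym in symbols(text):
        if sym == "(":
            stack.append(sym)
        elif sym == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()           # discard the matching "("
        elif sym in PRECEDENCE:
            # Emit stacked operators of equal or higher precedence first.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[sym]):
                output.append(stack.pop())
            stack.append(sym)
        else:
            output.append(sym)    # an operand goes straight to the output
    while stack:
        output.append(stack.pop())
    return " ".join(output)


result = polish_gen("a + b * (c + d)")
```

For the input string given in the text, `result` is `"a b c d + * +"`.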
The procedure Polish_gen, shown in Figure 39.1, performs the translation described above. It takes three arguments: input, an object of abstract type infile which holds the sentence of the input language; output, an object of abstract type outfile which will accept a sentence of the output language; and g, an object of abstract type grammar which can be used to recognize symbols of the input language and determine their precedence relations. In addition, Polish_gen makes use of local variables of abstract types stack and token. Note that all the data-type-names appear free in Polish_gen, as does “scan,” which names the single functional abstraction used by Polish_gen.
该语言使用与声明基本类型变量相同的语法来声明抽象数据类型的变量。语法区分涉及创建对象的声明和不涉及创建对象的声明。例如,
The language uses the same syntax to declare variables of abstract data type as to declare variables of primitive type. The syntax distinguishes between declarations which involve the creation of an object and those which do not. For example,
t: token
声明 t 是一个变量的名称,该变量保存抽象类型 token 的对象,但不会创建 token 对象,因此 t 的值最初是未定义的。因此,变量 t 的声明方式与下面的 mustscan 相同
states that t is the name of a variable which holds an object of abstract type token, but that no token object is to be created, so that the value of t is initially undefined. Thus the variable t is being declared in the same way as mustscan in
mustscan: boolean
类型名称后面的括号表示对象的创建。例如,
The presence of parentheses following the type name signals creation of an object. For example,
s: stack(token)
声明 s 是一个变量的名称,该变量保存一个抽象类型 stack 的对象,并且要创建一个 stack 对象并存储在 s 中。创建对象所需的信息通过参数列表传递;在该示例中,唯一的参数 token 定义了可以放置在堆栈 s 上的元素的类型。堆栈的声明类似于数组声明,例如“array[1..10] of characters”,因为它们都需要指定元素的类型。
states that s is the name of a variable which holds an object of abstract type stack, and a stack object is to be created and stored in s. Information required for creating the object is passed in a parameter list; in the example, the only parameter, token, defines the type of element which may be placed on the stack s. The declaration of a stack is similar to an array declaration, such as “array[1..10] of characters,” in that they both require the type of elements to be specified.
该语言是强类型的;因此,抽象对象只有三种使用方式:
The language is strongly typed; thus there are only three ways in which an abstract object can be used:
1. 抽象对象可以通过定义其抽象类型的操作进行操作。
1. An abstract object may be operated upon by the operations which define its abstract type.
2. 抽象对象可以作为参数传递给过程。在这种情况下,调用过程传递的实际参数的类型必须与被调用过程中相应的形式参数的类型相同。
2. An abstract object may be passed as a parameter to a procedure. In this case, the type of the actual argument passed by the calling procedure must be identical to the type of the corresponding formal parameter in the called procedure.
3. 抽象对象可以分配给变量,但前提是该变量被声明为保存该类型的对象。
3. An abstract object may be assigned to a variable, but only if the variable is declared to hold objects of that type.
对抽象对象的定义操作的应用由使用复合名称的操作调用来指示:例如,
Application of a defining operation to an abstract object is indicated by an operation call in which a compound name is used: for example,
grammar$eof(g)
stack$push(s, t)
token$is_op(t)
复合名称的第一部分标识操作所属的抽象类型,而第二部分标识操作。一个操作调用总是至少有一个参数——该操作所属的抽象类型的对象。
The first part of the compound name identifies the abstract type to which the operation belongs while the second component identifies the operation. An operation call will always have at least one parameter—an object of the abstract type to which the operation belongs.
类型名称包含在操作调用中的原因有多种。首先,由于操作调用可能具有不同抽象类型的多个参数,因此类型名称的缺失可能会导致实际操作哪个对象的歧义。其次,复合名称的使用允许不同的数据类型使用相同的名称进行操作,而不会产生任何标识符冲突。第三,我们相信,一旦读者习惯了这种符号,类型名称前缀将增强程序的可理解性。不仅操作的类型立即显而易见,而且操作调用与过程调用也清楚地区分开来。
There are several reasons why the type-name is included in the operation call. First, since an operation call may have several parameters of different abstract types, the absence of the type-name may lead to an ambiguity as to which object is actually being operated on. Second, use of the compound name permits different data types to use the same names for operations without any clash of identifiers arising. Third, we believe that the type-name prefix will enhance the understandability of programs, once the reader is used to the notation. Not only is the type of the operation immediately apparent, but operation calls are clearly distinguished from procedure calls.
该声明
The statement
t := scan(input, g)
说明了将抽象对象作为参数传递以及将抽象对象分配给变量。如图 39.2 所示,过程 scan 需要 infile 和 grammar 类型的对象作为其参数,并返回 token 类型的对象,然后将其存储在 token 变量 t 中。
illustrates both passing abstract objects as parameters, and assigning an abstract object to a variable. The procedure scan, shown in Figure 39.2, expects objects of type infile and grammar as its arguments, and returns an object of type token, which is then stored in the token variable t.
我们已经解释过可以结合变量声明来创建对象。也可以独立于变量声明来创建对象。对象的创建是通过类型名后跟括号来指定的(无论是否在声明内)。例如,在扫描的最后一行
We have explained that objects can be created in conjunction with variable declaration. It is also possible for objects to be created independently of variable declaration. Object creation is specified (whether inside a declaration or not) by the appearance of the typename followed by parentheses. For example, in the last line of scan
token(g, newsymb)
声明要创建一个代表刚刚扫描的符号的 token 对象;创建对象所需的信息(语法以及刚刚扫描的符号)在参数列表中传递。
states that a token object, representing the symbol just scanned, is to be created; the information required to create the object (the grammar and the symbol just scanned) is passed in a parameter list.
现在可以给出 Polish_gen 逻辑的简要描述。Polish_gen 使用函数抽象扫描从输入字符串中获取语法符号。Scan 以标记的形式返回符号——引入这种类型是为了提供高效的执行,而不泄露有关语法如何表示符号的信息。
A brief description of the logic of Polish_gen can now be given. Polish_gen uses the functional abstraction scan to obtain a symbol of the grammar from the input string. Scan returns the symbol in the form of a token—a type introduced to provide efficient execution without revealing information about how the grammar represents symbols.
Polish_gen 将包含新扫描符号的 token 存储在变量 t 中。如果 t 持有的 token 代表标识符(如“a”)而不是运算符(如“+”),则该标识符将立即写入输出文件。否则,将栈顶的 token 与 t 进行比较,以确定它们之间的优先关系。如果关系是“<”,则 t 被压入堆栈(例如,“+” < “*”)。如果关系是“=”,则 t 和栈顶 token 都被丢弃(例如,“(” = “)”)。如果关系是“>”,则栈顶 token 中保存的运算符将附加到输出文件,从而露出新的栈顶 token。由于该运算符 token 的优先级可能高于 t,因此布尔变量 mustscan 用于防止扫描新符号,并确保下一次比较仍使用 t 的当前值。由于文件结尾符号的语法相关表示 (grammar$eof(g)) 最初被压入堆栈,因此只有在输入耗尽并生成匹配的 eof token 时,堆栈才会变空,Polish_gen 才会结束。(我们做了简化的假设,即输入是中缀语言的合法句子。)
Polish_gen stores the token containing the newly scanned symbol in variable t. If t holds a token representing an identifier (like “a”) rather than an operator (like “+”), that identifier is put in the output file immediately. Otherwise, the token on top of the stack is compared with t to determine the precedence relation between them. If the relation is “<”, t is pushed on the stack (e.g., “+” < “*”). If the relation is “=”, both t and the top-of-stack token are discarded (e.g., “(” = “)”). If the relation is “>”, the operator held in the top-of-stack token is appended to the output file, exposing a new top-of-stack token. Since that operator token may have a higher precedence than t, the boolean variable mustscan is used to prevent a new symbol from being scanned and to ensure the next comparison is with the current value of t. Because a grammar-dependent representation of the end of file symbol (grammar$eof(g)) is initially pushed onto the stack, the stack will become empty, causing Polish_gen to complete, only when a matching eof token is generated by exhausting the input. (We have made the simplifying assumption that the input is a legitimate sentence of the infix language.)
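The logic just described can be sketched in Python. This is a minimal illustration, not the code of Figure 39.1: the names EOF, PRECEDENCE, symbols, is_op and prec_rel are assumptions introduced here, the grammar is reduced to a fixed table for +, -, * and /, and symbols are plain strings rather than token objects.

```python
import re

EOF = "<eof>"                                   # assumed stand-in for grammar$eof(g)
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}   # assumed toy grammar

def symbols(text):
    """Yield alphanumeric runs or single non-blank characters, then EOF forever."""
    yield from re.findall(r"[A-Za-z0-9]+|\S", text)
    while True:
        yield EOF

def is_op(sym):
    return sym == EOF or not sym.isalnum()

def prec_rel(top, t):
    """Relation between the top-of-stack operator and the incoming operator t."""
    if t == "(":
        return "<"
    if top == EOF:
        return "=" if t == EOF else "<"
    if top == "(":
        return "=" if t == ")" else "<"
    if t in (")", EOF):
        return ">"
    return "<" if PRECEDENCE[top] < PRECEDENCE[t] else ">"

def polish_gen(text):
    scan = symbols(text)
    out, stack = [], [EOF]          # the eof symbol is pushed initially
    mustscan, t = True, None
    while stack:
        if mustscan:
            t = next(scan)
        else:
            mustscan = True         # reuse the current t exactly once
        if not is_op(t):
            out.append(t)           # identifiers go straight to the output
            continue
        rel = prec_rel(stack[-1], t)
        if rel == "<":
            stack.append(t)
        elif rel == "=":
            stack.pop()             # discard both t and the top of the stack
        else:
            out.append(stack.pop())
            mustscan = False        # compare the same t with the new top
    return " ".join(out)
```

As in the paper, the sketch assumes the input is a legitimate infix sentence; malformed input is not diagnosed.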
扫描过程通过定义抽象类型 infile 的操作从输入文件中获取字符。它使用数据类型char和string以及对这些类型的对象的操作。尽管这些类型显示为内置类型,但它们很可能是抽象类型。例如,在这种情况下,内置谓词alphanumeric将表示为 char$alphanumeric。只是语法会改变;在这两种情况下,类型的含义和使用都是相同的。
The scan procedure obtains characters from the input file via the operations defining the abstract type infile. It makes use of the data types char and string, and operations on objects of these types. Although these types are shown as built-in, they could easily have been abstract types instead. In that case, the built-in predicate alphanumeric, for example, would have been expressed as char$alphanumeric. Only the syntax would change; the meaning and use of the types would be the same in either case.
总而言之,Polish_gen 使用了五种数据抽象:infile、outfile、grammar、token 和 stack,以及一种功能抽象:scan。infile 和 outfile 类型说明了数据抽象的强大功能,它们分别用于保护 Polish_gen 免受有关其输入和输出的任何物理事实的影响。当 I/O 实际发生时, Polish_gen 不知道正在使用什么输入和输出设备,也不知道字符在设备上是如何表示的。它所知道的足以满足其需要:对于参数输出,它知道如何添加字符串 (outfile$out_str) 以及如何表示输出已完成 (outfile$close)。对于参数输入,它知道如何获取下一个字符(infile$get),如何查看下一个字符而不将其从输入中删除(infile$peek),以及如何识别输入的结尾(infile$eof)。(请注意,为了使扫描正确运行,在到达文件末尾后,infile 必须在对 infile$get 或 infile$peek 的任何调用中提供非空白、非字母数字字符。)在每种情况下,它的知识都包含提供这些服务的操作的名称。
To sum up, Polish_gen makes use of five data abstractions, infile, outfile, grammar, token and stack, plus one functional abstraction, scan. The power of the data abstractions is illustrated by the types infile and outfile, which are used to shield Polish_gen from any physical facts concerning its input and output, respectively. Polish_gen does not know what input and output devices are being used, when the I/O actually takes place, nor does it know how characters are represented on the devices. What it does know is just enough for its needs: For parameter output it knows how to add a string of characters (outfile$out_str) and how to signify that the output is complete (outfile$close). For parameter input, it knows how to obtain the next character (infile$get), how to look at the next character without removing it from input (infile$peek), and how to recognize the end of input (infile$eof). (Note that for scan to operate correctly, infile must provide a non-blank, non-alphanumeric character on any call on infile$get or infile$peek after the end of file has been reached.) In every case its knowledge consists of the names of the operations which provide these services.
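To make the infile interface concrete, here is a small Python sketch of one possible implementation backed by a string, together with a simplified scan. The class name StringInfile and the END marker are assumptions made for illustration; the paper's scan returns a token built from the grammar, whereas this sketch returns the raw symbol.

```python
class StringInfile:
    """One possible infile: the caller sees only get, peek and eof."""

    END = "\x00"   # assumed end marker: non-blank and non-alphanumeric

    def __init__(self, text):
        self._text = text
        self._pos = 0

    def eof(self):
        return self._pos >= len(self._text)

    def peek(self):
        # Look at the next character without consuming it.
        return self.END if self.eof() else self._text[self._pos]

    def get(self):
        ch = self.peek()
        if not self.eof():
            self._pos += 1
        return ch


def scan(infile):
    """Return the next symbol: an alphanumeric run or a single other character."""
    while infile.peek() == " ":
        infile.get()                       # blanks terminate symbols, else ignored
    newsymb = infile.get()
    if newsymb.isalnum():
        while infile.peek().isalnum():
            newsymb += infile.get()
    return newsymb
```

Because END is neither blank nor alphanumeric, scan terminates cleanly after the end of input, which is the requirement stated in the parenthetical note above.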
在本节中,我们描述编程对象——操作集群——其翻译提供了类型的实现。该簇包含实现每个特征操作的代码,从而体现了数据类型由一组操作定义的思想。
In this section, we describe the programming object—the operation cluster—whose translation provides an implementation of a type. The cluster contains code implementing each of the characterizing operations and thereby embodies the idea that a data type is defined by a set of operations.
作为示例,请考虑 Polish_gen 使用的抽象数据类型堆栈。支持堆栈的集群如图39.3所示。该簇实现了一种非常通用的堆栈对象,其中堆栈元素的类型事先未知。簇参数 element_type 指示特定堆栈对象要包含的元素的类型。集群定义的第一部分提供了集群向用户呈现的界面的非常简短的描述。集群接口定义了集群的名称、创建集群实例所需的参数(集群实现的抽象类型的对象)以及定义集群实现的类型的操作列表,例如:
As an example, consider the abstract data type stack used by Polish_gen. A cluster supporting stacks is shown in Figure 39.3. This cluster implements a very general kind of stack object in which the type of the stack elements is not known in advance. The cluster parameter element_type indicates the type of element a particular stack object is to contain. The first part of a cluster definition provides a very brief description of the interface which the cluster presents to its users. The cluster interface defines the name of the cluster, the parameters required to create an instance of the cluster (an object of the abstract type which the cluster implements), and a list of the operations defining the type which the cluster implements, e.g.,
stack: cluster(element_type: type)
is push, pop, top, erasetop, empty
保留字 is 的使用强调了数据类型以一组操作为特征的想法。
The use of the reserved word “is” underlines the idea of a data type being characterized by a group of operations.
集群定义的其余部分描述了如何实际支持抽象类型,包含三个部分:对象表示、创建对象的代码和操作定义。
The remainder of the cluster definition, describing how the abstract type is actually supported, contains three parts: the object representation, the code to create objects and the operation definitions.
rep[(⟨rep-parameters⟩)] = ⟨type-definition⟩
定义了一个新类型,用保留字 rep 表示,它只能在簇内访问,并描述对象在簇内是如何被看待的。⟨type-definition⟩ 定义了一个模板,允许构建和分解该类型的对象。一般来说,它将利用该语言提供的数据结构方法:数组(可能无界)或 PASCAL 记录。可选的(“[ ]”)⟨rep-parameters⟩ 可以将 ⟨type-definition⟩ 某些方面的指定推迟到创建 rep 实例时。考虑堆栈簇的 rep 描述:
defines a new type, denoted by the reserved word rep, which is accessible only within the cluster and describes how objects are viewed there. The ⟨type-definition⟩ defines a template which permits objects of that type to be built and decomposed. In general, it will make use of the data structuring methods provided by the language: arrays (possibly unbounded) or PASCAL records. The optional (“[ ]”) ⟨rep-parameters⟩ make it possible to delay specifying some aspects of the ⟨type-definition⟩ until an instance of the rep is created. Consider the rep description of the stack cluster:
rep(type_param: type) = (tp: integer; e_type: type; stk: array[1..] of type_param)
⟨类型定义⟩指定堆栈对象由一个记录表示,该记录包含三个名为 tp、stk 和 e_type 的组件。参数 type_param 指定可以存储在名为 stk 的无界数组中的元素类型,该数组将保存推入堆栈对象的元素。同一类型也将存储在 e_type 组件中,并用于下文所述的类型检查。tp 组件保存堆栈最顶层元素的索引。
The ⟨type-definition⟩ specifies that a stack object is represented by a record containing three components named tp, stk, and e_type. The parameter, type_param, specifies the type of element which may be stored in the unbounded array named stk which will hold the elements pushed onto a stack object. This same type will also be stored in the e_type component, and is used for type checking as will be described below. The tp component holds the index of the topmost element of the stack.
s: stack(token)
(在执行时)发生的一件事是调用创建代码,导致执行该过程主体。集群的参数实际上是创建代码的参数。由于不提供自由变量(除了对外部定义模块的引用之外),因此操作或rep中的⟨类型定义⟩都无法访问这些参数。因此,有关要保存的参数的任何信息都必须显式插入到rep的每个实例中。
one thing that happens (at execution time) is a call on the create-code, causing that procedure body to be executed. The parameters of the cluster are actually parameters of the create-code. Since free variables, other than references to externally defined modules, are not provided, these parameters are not accessible either to the operations or to the ⟨type definition⟩ in the rep. Therefore, any information about the parameters that is to be saved must be explicitly inserted into each instance of the rep.
堆栈集群中显示的代码是典型的创建代码。首先,创建一个rep类型的对象;也就是说,分配空间来保存由rep定义的对象。然后,一些初始值被存储在对象中。最后,该对象被返回给调用者。当对象返回时,其类型从rep类型更改为簇定义的抽象类型。
The code shown in the stack cluster is typical of create-code. First, an object of type rep is created; that is, space is allocated to hold the object as defined by the rep. Then, some initial values are stored in the object. Finally, the object is returned to the caller. When the object is returned, its type is changed from type rep to the abstract type defined by the cluster.
操作始终至少有一个代表类型的参数。由于集群可能同时支持其定义类型的许多对象,因此该参数告诉操作要操作的特定对象。请注意,当该参数在调用者和操作之间传递时,其类型将从抽象类型更改为rep类型。
Operations always have at least one parameter—of type rep. Because the cluster may simultaneously support many objects of its defined type, this parameter tells the operation the particular object on which to operate. Note that the type of this parameter will change from the abstract type to type rep as it is passed between the caller and the operation.
由于该语言是强类型的,因此必须检查推入给定堆栈的对象类型与堆栈可以保存的元素类型的一致性。这种一致性要求是通过声明push的第二个参数的类型与作为push的第一个参数的堆栈对象的rep的e_type组件相同来在语法上指定的。翻译器可以生成代码来验证类型在运行时是否匹配,如果不匹配则引发错误。
Because the language is strongly typed, the type of objects pushed on a given stack must be checked for consistency with the type of elements the stack can hold. This consistency requirement is specified syntactically by declaring that the type of the second argument of push is to be the same as the e_type component of the rep of the stack object which is the first argument of push. The translator can generate code to verify that the types match at run time and to raise an error if they don’t.
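A rough Python analogue of the stack cluster may help fix the ideas. It is a sketch under stated assumptions, not a rendering of Figure 39.3: the constructor plays the role of the create-code, the private attributes play the role of the rep, and the tp index is left implicit in the length of the Python list. The run-time check in push mirrors the consistency test the translator is said to generate.

```python
class Stack:
    """Stack of elements of one fixed type, chosen when the stack is created."""

    def __init__(self, element_type):
        # Create-code: build the rep and store initial values in it.
        self._e_type = element_type   # saved so that push can check types
        self._stk = []                # unbounded array; top index is implicit

    def push(self, x):
        # Run-time check that x has exactly the element type of this stack.
        if type(x) is not self._e_type:
            raise TypeError(
                f"expected {self._e_type.__name__}, got {type(x).__name__}"
            )
        self._stk.append(x)

    def pop(self):
        return self._stk.pop()

    def top(self):
        return self._stk[-1]

    def erasetop(self):
        del self._stk[-1]

    def empty(self):
        return not self._stk
```

Python cannot prevent a caller from reaching into the underscore-prefixed attributes, so the hiding here is by convention only, whereas the language described in the paper enforces it.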
抽象数据类型的引入是为了让程序员在使用数据抽象时不必担心不相关的细节。但事实上我们已经走得更远了。由于该语言是强类型的,因此用户无法使用任何实现细节。在本节中,我们讨论这种限制带来的好处:产生的程序更加模块化,并且更容易理解、修改、维护和证明正确。
Abstract data types were introduced as a way of freeing a programmer from concern about irrelevant details in his use of data abstractions. But in fact we have gone further than that. Because the language is strongly typed, the user is unable to make use of any implementation details. In this section we discuss the benefits that accrue from this limitation: the programs which result are more modular, and easier to understand, modify, maintain and prove correct.
token 是为控制对实现细节的访问而创建的类型的一个很好的示例。Polish_gen 本可以不引入新类型,而是被编写为接受来自 scan 的字符串,将字符串存储在堆栈上,并比较字符串以确定优先关系(通过适当的操作 grammar$prec_rel)。这样的解决方案效率很低。由于优先级矩阵可以通过运算符在语法保留字表中的位置来索引,因此高效的实现只需查找字符串一次,以确定它是否是运算符符号,如果是,则在 Polish_gen 中使用该运算符的索引。
Token is a good example of a type created to control access to implementation details. Instead of introducing a new type, Polish_gen could have been written to accept strings from scan, to store strings on the stack, and to compare strings to determine the precedence relation (via an appropriate operation grammar$prec_rel). Such a solution would be inefficient. Since the precedence matrix can be indexed by the positions of the operators in the reserved word table of the grammar, an efficient implementation would look up the character string only once to find out if it is an operator symbol and, if so, use the index of the operator in Polish_gen.
然而,这暴露了有关语法表示的信息。如果 Polish_gen 或其他使用语法的模块使用此信息,则语法簇的正常维护和修改可能会引入难以追踪的错误(Parnas,1971)。因此,引入了新类型 token 来限制有关语法如何表示的信息的分发。现在,语法簇的重新定义只能影响标记簇,它不会对其从语法接收的索引做出任何假设。如果在查找优先关系时发生错误(例如索引越界),则该错误只能是由标记或语法簇中的某些内容引起的。
This, however, exposes information about the representation of the grammar. If Polish_gen or some other module which uses the grammar makes use of this information, normal maintenance and modification of the grammar cluster can introduce errors which are difficult to track down (Parnas, 1971). Therefore, the new type, token, is introduced to limit the distribution of information about how the grammar is represented. Now a redefinition of the grammar cluster can affect only the token cluster—which makes no assumptions about the index it receives from grammar. If an error occurs while looking up a precedence relation (like an index out of bounds), the error can only have been caused by something in the token or grammar cluster.
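The division of knowledge between the grammar and token clusters might look roughly like the following Python sketch. All names and the table layout are hypothetical: the paper does not show these clusters, and the only point illustrated is that the operator index is looked up once, kept inside the token, and never seen by a caller such as Polish_gen.

```python
class Grammar:
    """Hypothetical grammar: a reserved-word table plus a precedence matrix."""

    def __init__(self, operators, matrix):
        self._ops = list(operators)   # position in this table is the index
        self._matrix = matrix         # matrix[i][j]: relation of op i to op j

    def _index(self, symbol):
        return self._ops.index(symbol) if symbol in self._ops else None

    def _relation(self, i, j):
        return self._matrix[i][j]


class Token:
    """Hypothetical token: wraps a symbol and hides the grammar's index."""

    def __init__(self, g, symbol):
        self._g = g
        self._symbol = symbol
        self._index = g._index(symbol)   # the string is looked up only once

    def is_op(self):
        return self._index is not None

    def symbol(self):
        return self._symbol

    def prec_rel(self, other):
        # Only Token and Grammar ever handle the raw indices.
        return self._g._relation(self._index, other._index)
```

If the grammar later changes how it numbers or stores operators, only these two classes need to be revisited.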
实际上,令牌实现的选择(例如,令牌是用整数还是字符串表示)涉及设计决策。这个决定可以延迟到定义了标记的簇为止,并且不需要在 Polish_gen 的编码期间做出。因此,Polish_gen 的编程可以根据 Dijkstra 的编程原则之一来完成:一次构建一个决定(Dijkstra,1972)。遵循这一原则可以简化 Polish_gen 的逻辑,使其更易于理解和维护。
Actually, the selection of an implementation of tokens—for example, whether a token is represented by an integer or a character string—involves a design decision. This decision can be delayed until the cluster for tokens is defined and need not be made during the coding of Polish_gen. Therefore, the programming of Polish_gen can be done according to one of Dijkstra’s programming principles: build the program one decision at a time (Dijkstra, 1972). Following this principle leads to a simplified logic for Polish_gen, making it easier to understand and maintain.
使表示不可访问还可以使程序更容易被证明是正确的。程序的证明分为两部分:证明集群正确实现该类型,并证明使用该类型的程序是正确的。仅在前一个证明中需要考虑类型对象的实现细节;后一个证明仅基于类型的抽象属性,这些属性可以用每种类型的特征化操作之间的关系来表达。
Making the representation inaccessible also results in a program which is easier to prove correct. The proof of a program is divided into two parts: a proof that the cluster correctly implements the type, and a proof that the program using the type is correct. Only in the former proof need details of the implementation of type objects be considered; the latter proof is based only on the abstract properties of the types, which may be expressed in terms of relations among the characterizing operations for each type.
……
…
经计算机协会许可,转载自 Liskov 和 Zilles (1974)。
Reprinted from Liskov and Zilles (1974), with permission from the Association for Computing Machinery.
随着计算机系统变得越来越大、越来越复杂,一个令人不快的事实变得显而易见。软件很难写,更难判断要写多久。为工程项目管理而开发的协议似乎不适用于软件。计算机科学家 Tom Cheatham 曾经说过:“土木工程和软件工程之间的区别在于,当有人告诉你一座桥已经建成了一半时,你可以走到桥上看看。” (参见罗伊斯的 90% 完成综合症,第 326 页。)
As computer systems grew larger and more complicated, an unhappy fact became evident. Software was hard to write, and it was even harder to judge how long it would take to write. The protocols that had been developed for the management of engineering projects didn’t seem applicable to software. Computer scientist Tom Cheatham used to say, “The difference between civil engineering and software engineering is that when someone tells you that a bridge is half built, you can walk out onto it and see.” (Cf. Royce’s 90%-finished syndrome, page 326.)
Frederick C. “Fred” Brooks(生于 1931 年)是哈佛大学霍华德·艾肯 (Howard Aiken) 的博士生,他的博士论文 (Brooks, 1956) 分析了商业数据处理中的一个问题。他于 1956 年加入 IBM,当时该公司正在生产一系列功能日益强大的计算机,但每一款都与其前身不兼容。他认识到(正如霍珀所预测的那样)如果客户必须重写程序才能升级到较新的机器,软件成本将是不可持续的,因此他设计了 System/360 系列,使在该系列中一台机器上运行的软件可以轻松移植到更新、更强大的型号上运行,甚至是底层硬件实现不同的型号(微编程有所帮助;请参见第 165 页)。布鲁克斯创造了“计算机体系结构”一词,用来指代软件所看到的计算机系统的结构。
Frederick C. “Fred” Brooks (b. 1931) was a PhD student of Howard Aiken at Harvard and wrote his PhD thesis (Brooks, 1956) analyzing a problem in business data processing. He joined IBM in 1956 as the company was producing a series of increasingly powerful computers, each incompatible with its predecessor. Recognizing (as Hopper had predicted) that software costs would be unsustainable if customers had to rewrite their programs to move up to newer machines, he designed the System/360 line so that software that ran on one machine in the line could easily be ported to run on a newer and more powerful model, even one with a different underlying hardware implementation (microprogramming helped; see page 165). Brooks coined the term “computer architecture” to refer to the structure of a computer system as the software saw it.
布鲁克斯 1964 年离开 IBM 时,小托马斯·沃森 (Thomas Watson Jr.) 问他为什么软件项目如此难以管理。“神话般的人月”是布鲁克斯答案的一部分。这篇文章是同名文集中的几篇文章之一;所有文章都值得一读,但这一篇已经获得了标志性的地位。值得注意的是,如今仍然常常听到通过为项目增加人手来加快完成速度的建议,这些建议既没有考虑新成员达到所需知识水平所花的时间,也没有考虑协调更多贡献者的活动所带来的问题。
When Brooks left IBM in 1964, Thomas Watson Jr. asked him why software projects are so hard to manage. “The Mythical Man-Month” is part of Brooks’s answer. This essay is one of several in a volume of the same name; all are worth reading, but this one has acquired iconic status. Remarkably, it is still not unusual to hear proposals to add labor to a project in order to speed up its completion, taking into account neither the time for the new workers to come up to the needed knowledge level, nor the problems of coordinating the activities of a larger number of contributors.
布鲁克斯离开 IBM 后,在北卡罗来纳大学创建了计算机科学系,并在那里度过了余下的职业生涯。他的团队为计算机图形学做出了重大贡献,但他最持久的遗产是他在软件工程方面的智慧。(“所有程序员都是乐观主义者”,至今依然如此。)十年后,他发表了另一篇针对软件工程“焦油坑”的犀利分析,“没有银弹:软件工程中的本质和意外”(Brooks,1987),他在其中试图解释为什么尽管付出了巨大的努力、做出了宏大的承诺,为加速软件生产过程而开发的技术似乎都没有使这一过程明显变得更简单、更快,也没有大幅提高成果的质量。近年来,模块化、重用和开源库有所帮助,但绝没有使这篇文章中朴素的智慧失效。
Brooks went from IBM to found the computer science department at the University of North Carolina, where he spent the rest of his career. His group has made significant contributions to computer graphics, but his most enduring legacy is his wisdom on software engineering. (“All programmers are optimists” still.) A decade later he published another trenchant analysis of the “tar pit” of software engineering, “No Silver Bullet: Essence and Accident in Software Engineering” (Brooks, 1987), in which he sought to explain why—despite intense effort and grandiose promises—none of the technologies that had been developed to speed the software production process seemed to have made it significantly simpler and faster, or to have greatly improved the quality of the result. Modularization, re-use, and open source libraries have helped in recent years, but have by no means invalidated the humble wisdom of this essay.
[编辑:菜单顶部的文字说:“美味佳肴需要时间。如果我们让您等待,是为了更好地为您服务并取悦您。”]
[EDITOR: The text at the top of the menu says, “Good cuisine takes time. If we make you wait, it’s to serve you better and please you.”]
由于缺乏日历时间而出错的软件项目比所有其他原因出错的总和还多。为什么这种灾难原因如此普遍?
MORE software projects have gone awry for lack of calendar time than for all other causes combined. Why is this cause of disaster so common?
首先,我们的估算技术还很不发达。更严重的是,它们反映了一种不言而喻的假设,这种假设是完全不真实的,即一切都会顺利。
First, our techniques of estimating are poorly developed. More seriously, they reflect an unvoiced assumption which is quite untrue, i.e., that all will go well.
其次,我们的估算技术错误地将努力与进展混为一谈,隐藏了“人”和“月”可以互换的假设。
Second, our estimating techniques fallaciously confuse effort with progress, hiding the assumption that men and months are interchangeable.
第三,由于我们对自己的估计不确定,软件经理往往缺乏安东尼厨师那样彬彬有礼的固执。
Third, because we are uncertain of our estimates, software managers often lack the courteous stubbornness of Antoine’s chef.
第四,进度监控不力。在其他工程学科中经过验证的常规技术,在软件工程中却被视为激进的创新。
Fourth, schedule progress is poorly monitored. Techniques proven and routine in other engineering disciplines are considered radical innovations in software engineering.
第五,当发现进度延误时,自然(也是传统)的反应是增加人力。就像用汽油扑灭大火一样,这会让事情变得更糟,更糟。更多的火灾需要更多的汽油,从而开始一个以灾难告终的再生循环。
Fifth, when schedule slippage is recognized, the natural (and traditional) response is to add manpower. Like dousing a fire with gasoline, this makes matters worse, much worse. More fire requires more gasoline, and thus begins a regenerative cycle which ends in disaster.
时间表监控将是另一篇文章的主题。让我们更详细地考虑问题的其他方面。
Schedule monitoring will be the subject of a separate essay. Let us consider other aspects of the problem in more detail.
所有程序员都是乐观主义者。也许这种现代巫术特别吸引那些相信幸福结局和仙女教母的人。也许数百个细节上的挫折会驱散所有人,除了那些习惯性关注最终目标的人。也许只是计算机年轻了,程序员年轻了,年轻人总是乐观主义者。但无论选择过程如何进行,结果都是无可争议的:“这次肯定会运行”,或者“我刚刚发现了最后一个错误”。
All programmers are optimists. Perhaps this modern sorcery especially attracts those who believe in happy endings and fairy godmothers. Perhaps the hundreds of nitty frustrations drive away all but those who habitually focus on the end goal. Perhaps it is merely that computers are young, programmers are younger, and the young are always optimists. But however the selection process works, the result is indisputable: “This time it will surely run,” or “I just found the last bug.”
因此,系统编程调度的第一个错误假设是一切都会顺利,即每个任务只需要它“应该”花费的时间。
So the first false assumption that underlies the scheduling of systems programming is that all will go well, i.e., that each task will take only as long as it “ought” to take.
程序员中普遍存在的乐观情绪值得做比草率评论更深入的分析。多萝西·塞耶斯(Dorothy Sayers)在她的优秀著作《创造者的思想》(The Mind of the Maker) 中将创造性活动分为三个阶段:构想、实现和交互。那么,一本书、一台计算机或一个程序首先是作为一种理想的构造而存在的,它建立在时间和空间之外,但在作者的头脑中是完整的。它借助笔、墨水和纸,或者借助电线、硅和铁氧体,在时间和空间中得以实现。当有人阅读这本书、使用这台计算机或运行这个程序,从而与创造者的思想进行交互时,创造才算完成。
The pervasiveness of optimism among programmers deserves more than a flip analysis. Dorothy Sayers, in her excellent book, The Mind of the Maker, divides creative activity into three stages: the idea, the implementation, and the interaction. A book, then, or a computer, or a program comes into existence first as an ideal construct, built outside time and space, but complete in the mind of the author. It is realized in time and space, by pen, ink, and paper, or by wire, silicon, and ferrite. The creation is complete when someone reads the book, uses the computer, or runs the program, thereby interacting with the mind of the maker.
塞耶斯小姐用这个描述不仅阐明了人类的创造性活动,而且阐明了基督教的三位一体教义,将有助于我们完成当前的任务。对于人类的创造者来说,我们的想法的不完整性和不一致只有在实施过程中才会变得清晰。因此,写作、实验、“解决”是理论家的基本学科。
This description, which Miss Sayers uses to illuminate not only human creative activity but also the Christian doctrine of the Trinity, will help us in our present task. For the human makers of things, the incompletenesses and inconsistencies of our ideas become clear only during implementation. Thus it is that writing, experimentation, “working out” are essential disciplines for the theoretician.
在许多创造性活动中,执行媒介是很棘手的。木材劈裂;油漆涂抹;电路环。媒介的这些物理限制限制了可能表达的想法,并且它们也在实施中造成了意想不到的困难。
In many creative activities the medium of execution is intractable. Lumber splits; paints smear; electrical circuits ring. These physical limitations of the medium constrain the ideas that may be expressed, and they also create unexpected difficulties in the implementation.
因此,实施需要时间和汗水,这既是由于物理媒体的原因,也是因为基本思想的不足。我们倾向于将大部分实施困难归咎于物理媒体;因为媒体不像思想那样是“我们的”,我们的骄傲影响了我们的判断。
Implementation, then, takes time and sweat both because of the physical media and because of the inadequacies of the underlying ideas. We tend to blame the physical media for most of our implementation difficulties; for the media are not “ours” in the way the ideas are, and our pride colors our judgment.
然而,计算机编程是使用一种非常易于处理的介质进行创作的。程序员从纯粹的思想内容构建:概念及其非常灵活的表示。由于该媒介易于处理,因此我们预计实施过程中不会遇到什么困难;因此我们普遍保持乐观态度。因为我们的想法是错误的,所以我们就有错误;因此,我们的乐观情绪是没有道理的。
Computer programming, however, creates with an exceedingly tractable medium. The programmer builds from pure thought-stuff: concepts and very flexible representations thereof. Because the medium is tractable, we expect few difficulties in implementation; hence our pervasive optimism. Because our ideas are faulty, we have bugs; hence our optimism is unjustified.
在单个任务中,一切顺利的假设会对进度产生概率影响。它确实可能按计划进行,因为将遇到的延迟存在概率分布,而“无延迟”的概率是有限的。然而,大型编程工作由许多任务组成,其中一些任务是端到端链接的。每一项都顺利进行的可能性变得微乎其微。
In a single task, the assumption that all will go well has a probabilistic effect on the schedule. It might indeed go as planned, for there is a probability distribution for the delay that will be encountered, and “no delay” has a finite probability. A large programming effort, however, consists of many tasks, some chained end-to-end. The probability that each will go well becomes vanishingly small.
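The arithmetic behind this claim is easy to check. The sketch below assumes, purely for illustration, that the chained tasks are independent and that each goes well with the same probability p; neither assumption comes from the essay, and the function name is invented here.

```python
def p_all_go_well(p, n):
    """Probability that n independent chained tasks all finish without delay."""
    return p ** n

# Even with a 90% chance per task, a chain of tasks rarely goes as planned.
for n in (1, 5, 20, 50):
    print(n, round(p_all_go_well(0.9, n), 4))
```

Under these assumptions the chance of an undelayed schedule shrinks geometrically with the length of the chain.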
第二种错误的思维模式体现在估算和调度中使用的工作量单位:人月。成本确实会随着人员数量和月数的乘积而变化。进步却没有。因此,以人月作为衡量工作规模的单位是一个危险且具有欺骗性的神话。这意味着男人和月份是可以互换的。
The second fallacious thought mode is expressed in the very unit of effort used in estimating and scheduling: the man-month. Cost does indeed vary as the product of the number of men and the number of months. Progress does not. Hence the man-month as a unit for measuring the size of a job is a dangerous and deceptive myth. It implies that men and months are interchangeable.
只有当一项任务可以分配给许多工人且他们之间无需沟通时,人和月才是可以互换的商品(图 40.1)。收割小麦、采摘棉花都是如此;对于系统编程来说,这甚至连近似成立都谈不上。
Men and months are interchangeable commodities only when a task can be partitioned among many workers with no communication among them (Figure 40.1). This is true of reaping wheat or picking cotton; it is not even approximately true of systems programming.
图 40.1: 时间与工人数量的关系——完全可划分的任务。
Figure 40.1: Time versus number of workers—perfectly partitionable task.
当由于顺序约束而无法对任务进行划分时,投入更多工作量不会对进度产生影响(图 40.2)。无论分配多少妇女,生孩子都需要九个月的时间。由于调试的顺序性,许多软件任务都具有此特征。对于可以划分但子任务之间需要沟通的任务,必须在要完成的工作量之上再加上沟通的工作量。因此,所能达到的最好结果也比人与月的等量交换要差一些(图 40.3)。
When a task cannot be partitioned because of sequential constraints, the application of more effort has no effect on the schedule (Figure 40.2). The bearing of a child takes nine months, no matter how many women are assigned. Many software tasks have this characteristic because of the sequential nature of debugging. In tasks that can be partitioned but which require communication among the subtasks, the effort of communication must be added to the amount of work to be done. Therefore the best that can be done is somewhat poorer than an even trade of men for months (Figure 40.3).
图 40.2: 时间与工作人员数量 — 不可分区的任务。
Figure 40.2: Time versus number of workers—unpartitionable task.
图 40.3: 时间与工作人员数量 — 需要通信的可分区任务。
Figure 40.3: Time versus number of workers—partitionable task requiring communication.
额外的沟通负担由两部分组成:培训和相互沟通。每个工人都必须接受技术、工作目标、总体策略和工作计划方面的培训。这种培训无法分割,因此这部分增加的工作量随工人数量线性变化。
The added burden of communication is made up of two parts, training and intercommunication. Each worker must be trained in the technology, the goals of the effort, the overall strategy, and the plan of work. This training cannot be partitioned, so this part of the added effort varies linearly with the number of workers.
相互沟通更差。如果任务的每个部分必须分别与其他部分协调,则工作量会增加为n ( n − 1)/2。三个工人需要的成对相互交流是两个工人的三倍;四个需要的量是两个的六倍。而且,如果需要三人、四人等工人开会来共同解决问题,事情就会变得更糟。额外的沟通工作可能会完全抵消原始任务的划分,并将我们带到图40.4的情况。
Intercommunication is worse. If each part of the task must be separately coordinated with each other part, the effort increases as n(n − 1)/2. Three workers require three times as much pairwise intercommunication as two; four require six times as much as two. If, moreover, there need to be conferences among three, four, etc., workers to resolve things jointly, matters get worse yet. The added effort of communicating may fully counteract the division of the original task and bring us to the situation of Figure 40.4.
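The pairwise count quoted above can be verified directly; the function name below is chosen for this illustration.

```python
def pairwise_channels(n):
    """Number of distinct pairs among n workers: n(n - 1)/2."""
    return n * (n - 1) // 2

# Ratios relative to a two-person team, matching the figures in the text.
for n in (2, 3, 4, 10):
    print(n, pairwise_channels(n), pairwise_channels(n) // pairwise_channels(2))
```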
图 40.4: 时间与工人数量的关系——具有复杂相互关系的任务。
Figure 40.4: Time versus number of workers—task with complex interrelationships.
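The n(n − 1)/2 growth of pairwise intercommunication can be checked with a few lines of Python. This is only an illustrative sketch: the function name `pairwise_channels` is a hypothetical helper, not something taken from Brooks's text.

```python
def pairwise_channels(n: int) -> int:
    """Number of distinct pairs among n workers: n(n - 1)/2."""
    return n * (n - 1) // 2


# Compare each team size against a two-person team, as the text does.
baseline = pairwise_channels(2)
for n in (2, 3, 4, 10):
    pairs = pairwise_channels(n)
    print(f"{n} workers: {pairs} pairs, {pairs // baseline}x the two-worker load")
```

Three workers give 3 pairs and four give 6, matching the "three times" and "six times" figures quoted above; the count ignores the larger conferences that the text says make matters worse.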
由于软件构建本质上是一项系统工作(复杂相互关系中的一项练习),因此沟通工作量很大,并且它很快就会主导分区带来的单个任务时间的减少。增加更多的人员会延长而不是缩短时间表。
Since software construction is inherently a systems effort—an exercise in complex interrelationships—communication effort is great, and it quickly dominates the decrease in individual task time brought about by partitioning. Adding more men then lengthens, not shortens, the schedule.
日程表中没有哪个部分像组件调试和系统测试那样受到顺序约束的彻底影响。此外,所需的时间取决于遇到的错误的数量和微妙程度。理论上这个数字应该为零。由于乐观,我们通常预计错误数量会比实际情况少。因此,测试通常是编程中安排最错误的部分。
No parts of the schedule are so thoroughly affected by sequential constraints as component debugging and system test. Furthermore, the time required depends on the number and subtlety of the errors encountered. Theoretically this number should be zero. Because of optimism, we usually expect the number of bugs to be smaller than it turns out to be. Therefore testing is usually the most mis-scheduled part of programming.
多年来,我一直成功地使用以下经验法则来安排软件任务:
For some years I have been successfully using the following rule of thumb for scheduling a software task:
1/3 规划
1/3 planning
1/6编码
1/6 coding
1/4组件测试和早期系统测试
1/4 component test and early system test
1/4系统测试,所有组件都在手。
1/4 system test, all components in hand.
这在几个重要方面与传统调度不同:
This differs from conventional scheduling in several important ways:
1. 用于规划的比例高于正常水平。即便如此,这还不足以产生详细而可靠的规范,也不足以包括对全新技术的研究或探索。
1. The fraction devoted to planning is larger than normal. Even so, it is barely enough to produce a detailed and solid specification, and not enough to include research or exploration of totally new techniques.
2. 用于调试已完成代码的进度的一半比正常情况要多得多。
2. The half of the schedule devoted to debugging of completed code is much larger than normal.
3、容易估计的部分,即编码,只给出了进度表的六分之一。
3. The part that is easy to estimate, i.e., coding, is given only one-sixth of the schedule.
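As a rough illustration of the rule of thumb, the sketch below splits a total schedule into the four phases using exact fractions. The names `RULE_OF_THUMB` and `split_schedule`, and the 12-month example, are assumptions made for the example only.

```python
from fractions import Fraction

# Shares of the schedule given by the rule of thumb in the text.
RULE_OF_THUMB = {
    "planning": Fraction(1, 3),
    "coding": Fraction(1, 6),
    "component test and early system test": Fraction(1, 4),
    "system test, all components in hand": Fraction(1, 4),
}


def split_schedule(total_months):
    """Allocate total_months to each phase according to RULE_OF_THUMB."""
    return {phase: share * total_months for phase, share in RULE_OF_THUMB.items()}


# The four shares account for the whole schedule.
assert sum(RULE_OF_THUMB.values()) == 1

for phase, months in split_schedule(12).items():
    print(f"{phase}: {months} months")
```

For a hypothetical 12-month schedule this allots 4 months to planning, 2 to coding, and 3 to each testing phase, so testing takes half of the total.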
在检查传统安排的项目时,我发现很少有项目允许用预计时间表的一半进行测试,但大多数确实为此目的花费了实际时间表的一半。其中许多在系统测试之前都按计划进行。
In examining conventionally scheduled projects, I have found that few allowed one-half of the projected schedule for testing, but that most did indeed spend half of the actual schedule for that purpose. Many of these were on schedule until and except in system testing.
尤其是,未能留出足够的时间进行系统测试尤其是灾难性的。由于延误是在日程结束时发生的,所以直到临近交货日期时,没有人意识到日程有问题。迟来且毫无预警的坏消息会让客户和经理感到不安。
Failure to allow enough time for system test, in particular, is peculiarly disastrous. Since the delay comes at the end of the schedule, no one is aware of schedule trouble until almost the delivery date. Bad news, late and without warning, is unsettling to customers and to managers.
此外,此时的拖延会造成异常严重的财务和心理影响。该项目人员配备齐全,每日成本最高。更严重的是,软件是为了支持其他业务工作(计算机的运输、新设施的运营等),而延迟这些的二次成本非常高,因为几乎已经到了软件发货的时间。事实上,这些次要成本可能远远超过所有其他成本。因此,在原计划中留出足够的系统测试时间非常重要。
Furthermore, delay at this point has unusually severe financial, as well as psychological, repercussions. The project is fully staffed, and cost-per-day is maximum. More seriously, the software is to support other business effort (shipping of computers, operation of new facilities, etc.) and the secondary costs of delaying these are very high, for it is almost time for software shipment. Indeed, these secondary costs may far outweigh all others. It is therefore very important to allow enough system test time in the original schedule.
观察到,对于程序员来说,就像对于厨师一样,顾客的紧迫性可以控制任务的预定完成,但它不能控制实际的完成情况。承诺两分钟内做一份煎蛋卷,看来进展顺利。但当它在两分钟内还没有凝固时,顾客有两个选择——等待或生吃。软件客户也有同样的选择。厨师还有另一个选择:他可以把暖气调高。结果往往是煎蛋卷无可挽救——一部分烧焦,另一部分生。
Observe that for the programmer, as for the chef, the urgency of the patron may govern the scheduled completion of the task, but it cannot govern the actual completion. An omelette, promised in two minutes, may appear to be progressing nicely. But when it has not set in two minutes, the customer has two choices—wait or eat it raw. Software customers have had the same choices. The cook has another choice; he can turn up the heat. The result is often an omelette nothing can save—burned in one part, raw in another.
现在我不认为软件经理比厨师或其他工程经理拥有更少的内在勇气和坚定性。但是,与顾客期望的日期相匹配的错误安排在我们的学科中比其他工程领域更为常见。对于没有定量方法得出、缺乏数据支持且主要由管理者的直觉证明的估计,要做出有力的、可信的、冒着工作风险的辩护是非常困难的。
Now I do not think software managers have less inherent courage and firmness than chefs, nor than other engineering managers. But false scheduling to match the patron’s desired date is much more common in our discipline than elsewhere in engineering. It is very difficult to make a vigorous, plausible, and job-risking defense of an estimate that is derived by no quantitative method, supported by little data, and certified chiefly by the hunches of the managers.
显然需要两种解决方案。我们需要开发并公布生产力数据、错误发生率数据、估算规则等等。整个行业只能从共享这些数据中获益。
Clearly two solutions are needed. We need to develop and publicize productivity figures, bug-incidence figures, estimating rules, and so on. The whole profession can only profit from sharing such data.
在估算建立在更合理的基础上之前,个别管理者需要坚定自己的决心并捍卫自己的估算,并确保他们的不良预感比基于愿望的估算要好。
Until estimating is on a sounder basis, individual managers will need to stiffen their backbones and defend their estimates with the assurance that their poor hunches are better than wish-derived estimates.
当一个重要的软件项目落后于计划时该怎么办?自然要增加人力。如图40.1到40.4所示,这可能有帮助,也可能没有帮助。
What does one do when an essential software project is behind schedule? Add manpower, naturally. As Figures 40.1 through 40.4 suggest, this may or may not help.
让我们考虑一个例子。假设一项任务预计需要 12 个人月,并分配给三名男子,为期四个月,并且有可衡量的里程碑 A、B、C、D,这些里程碑计划在每个月底落下(图 40.5 )。
Let us consider an example. Suppose a task is estimated at 12 man-months and assigned to three men for four months, and that there are measurable mileposts A, B, C, D, which are scheduled to fall at the end of each month (Figure 40.5).
现在假设两个月后才达到第一个里程碑(图 40.6)。经理面临哪些选择?
Now suppose the first milepost is not reached until two months have elapsed (Figure 40.6). What are the alternatives facing the manager?
1. 假设任务必须按时完成。假设只有任务的第一部分被错误估计,因此图 40.6准确地讲述了这个故事。那么还剩下 9 个人月的工作量,还有两个月,所以需要 4½ 人。在指定的 3 名人员中添加 2 名人员。
1. Assume that the task must be done on time. Assume that only the first part of the task was misestimated, so Figure 40.6 tells the story accurately. Then 9 man-months of effort remain, and two months, so 4½ men will be needed. Add 2 men to the 3 assigned.
2. 假设任务必须按时完成。假设整个估计值都偏低,那么图 40.7就真实描述了这种情况。那么还剩下 18 个人月的工作量,还有两个月,所以需要 9 个人。在分配的 3 名人员的基础上再增加 6 名人员。
2. Assume that the task must be done on time. Assume that the whole estimate was uniformly low, so that Figure 40.7 really describes the situation. Then 18 man-months of effort remain, and two months, so 9 men will be needed. Add 6 men to the 3 assigned.
3. 重新安排。我喜欢经验丰富的硬件工程师 P. Fagg 给出的建议:“不要小幅延期。” 即在新的日程中留出足够的时间,保证工作能够认真、彻底地完成,而不必再次重新安排。
3. Reschedule. I like the advice given by P. Fagg, an experienced hardware engineer, “Take no small slips.” That is, allow enough time in the new schedule to ensure that the work can be carefully and thoroughly done, and that rescheduling will not have to be done again.
4. 修剪任务。实际上,一旦团队发现进度延误,这种情况就很可能发生。当延误的次要成本非常高时,这是唯一可行的行动。经理唯一的选择是正式而仔细地修剪它,重新安排时间,或者看着任务被仓促的设计和不完整的测试默默地修剪。
4. Trim the task. In practice this tends to happen anyway, once the team observes schedule slippage. Where the secondary costs of delay are very high, this is the only feasible action. The manager’s only alternatives are to trim it formally and carefully, to reschedule, or to watch the task get silently trimmed by hasty design and incomplete testing.
在前两种情况下,坚持在四个月内完成不变的任务是灾难性的。例如,考虑第一种替代方案的再生效应(图 40.7)。这两名新人,无论能力如何,无论招募速度如何,都需要由其中一名经验丰富的人进行这项任务的培训。如果这需要一个月,就会有 3 个人月投入到原估算之外的工作中。此外,原本划分为三个部分的任务必须重新划分为五个部分;因此,一些已经完成的工作将会丢失,并且系统测试必须延长。因此,到第三个月末,剩余的工作量远超过 7 个人月,而可用的只有 5 名经过培训的人员和 1 个月的时间。如图 40.8 所示,产品延期的程度与没有增加任何人员时一样(图 40.6)。
In the first two cases, insisting that the unaltered task be completed in four months is disastrous. Consider the regenerative effects, for example, for the first alternative (Figure 40.7). The two new men, however competent and however quickly recruited, will require training in the task by one of the experienced men. If this takes a month, 3 man-months will have been devoted to work not in the original estimate. Furthermore, the task, originally partitioned three ways, must be repartitioned into five parts; hence some work already done will be lost, and system testing must be lengthened. So at the end of the third month, substantially more than 7 man-months of effort remain, and 5 trained people and one month are available. As Figure 40.8 suggests, the product is just as late as if no one had been added (Figure 40.6).
如果希望在四个月内完成,仅考虑培训时间而不考虑重新分区和额外的系统测试,则需要在第二个月末增加 4 名人员,而不是 2 名。为了涵盖重新分区和系统测试效果,还必须添加其他人。然而,现在的团队至少有 7 人,而不是 3 人;因此,团队组织、任务分工等方面不仅是程度不同,而且是性质不同。
To hope to get done in four months, considering only training time and not repartitioning and extra systems test, would require adding 4 men, not 2, at the end of the second month. To cover repartitioning and system test effects, one would have to add still other men. Now, however, one has at least a 7-man team, not a 3-man one; thus such aspects as team organization and task division are different in kind, not merely in degree.
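The arithmetic of this example can be reproduced with a small model. It is a sketch under the text's stated simplifications: new men are unproductive for a one-month training period, one experienced man is fully occupied training them, and repartitioning and extra system test are ignored. The function names are hypothetical.

```python
from fractions import Fraction


def men_needed(remaining_man_months, months_left):
    """Naive staffing: remaining effort divided evenly over the time left."""
    return Fraction(remaining_man_months, months_left)


def effort_delivered(staff, added, months_left, training_months=1, trainers=1):
    """Productive man-months in months_left when `added` men join now.

    Assumes the added men produce nothing while training, and that
    `trainers` experienced men produce nothing during that period either.
    """
    if months_left <= training_months:
        raise ValueError("no time left after training")
    busy = trainers if added else 0
    during_training = (staff - busy) * training_months
    after_training = (staff + added) * (months_left - training_months)
    return during_training + after_training


def min_added_men(remaining_man_months, staff, months_left):
    """Smallest number of added men that covers the remaining effort."""
    added = 0
    while effort_delivered(staff, added, months_left) < remaining_man_months:
        added += 1
    return added


print(men_needed(9, 2))            # alternative 1: 9/2 men
print(men_needed(18, 2))           # alternative 2: 9 men
print(effort_delivered(3, 2, 2))   # adding 2 men yields only 7 of the 9 man-months
print(min_added_men(9, 3, 2))      # men to add once training is counted
```

Under these assumptions, adding 2 men delivers 7 man-months against the 9 required, and 4 added men are needed to finish on time, in line with the figures in the text.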
请注意,到了第三个月末,事情看起来非常糟糕。尽管管理层付出了所有努力,但 3 月 1 日的里程碑仍未实现。重复这个循环、增加更多人力的诱惑非常大。这就是疯狂。
Notice that by the end of the third month things look very black. The March 1 milestone has not been reached in spite of all the managerial effort. The temptation is very strong to repeat the cycle, adding yet more manpower. Therein lies madness.
前述假设仅错误估计了第一个里程碑。如果在 3 月 1 日,我们做出保守的假设,即整个计划是乐观的,如图40.7所示,那么我们想在原始任务中添加 6 名人员。训练、重新分区、系统测试效果的计算留给读者作为练习。毫无疑问,与最初的三人未经增强的重新安排相比,再生灾难将产生更差的产品。
The foregoing assumed that only the first milestone was misestimated. If on March 1 one makes the conservative assumption that the whole schedule was optimistic, as Figure 40.7 depicts, one wants to add 6 men just to the original task. Calculation of the training, repartitioning, system testing effects is left as an exercise for the reader. Without a doubt, the regenerative disaster will yield a poorer product, later, than would rescheduling with the original three men, unaugmented.
我们过于简单化地陈述布鲁克斯定律:
Oversimplifying outrageously, we state Brooks’s Law:
向已晚的软件项目添加人力会使项目变得更晚。
Adding manpower to a late software project makes it later.
这就是人月的去神话化。项目的月数取决于其连续的限制。最大人数取决于独立子任务的数量。根据这两个数量,我们可以得出使用更少的人员和更多的月份的时间表。(唯一的风险是产品过时。)然而,我们无法用更多的人员和更少的时间来制定可行的时间表。由于缺乏日历时间而出错的软件项目比所有其他原因出错的总和还要多。
This then is the demythologizing of the man-month. The number of months of a project depends upon its sequential constraints. The maximum number of men depends upon the number of independent subtasks. From these two quantities one can derive schedules using fewer men and more months. (The only risk is product obsolescence.) One cannot, however, get workable schedules using more men and fewer months. More software projects have gone awry for lack of calendar time than for all other causes combined.
经 Pearson Education, Inc. 许可,转载自 Brooks (1995)。
Reprinted from Brooks (1995), with permission from Pearson Education, Inc.
网络很棘手。为了确保通过连接潜在敌对节点的容易发生故障的链路进行完美的消息传递,协议必须预测难以想象的情况。必须以怀疑的态度对待时间依赖性,并且可能依赖于专用硬件的实现可能难以验证。随着计算机变得越来越小,企业开始使用不同制造商的硬件安装本地网络,困难变得更加复杂。
Networking is tricky. To ensure flawless message delivery over failure-prone links connecting potentially hostile nodes, protocols have to anticipate hard-to-imagine circumstances. Time dependencies must be treated skeptically, and the implementation, which may rely on special purpose hardware, may be hard to verify. The difficulties compounded as computers became smaller and businesses started to install local networks with hardware from disparate manufacturers.
罗伯特·“鲍勃”·梅特卡夫(Robert “Bob” Metcalfe,生于 1946 年)于 1969 年在麻省理工学院获得本科学位,然后开始在哈佛大学攻读研究生,当时正是各大学首次连接到 ARPANET 的时候。哈佛拒绝了梅特卡夫负责其 ARPANET 连接的提议,因此他在麻省理工学院接受了类似的工作。虽然这段经历点燃了梅特卡夫对网络的热情,但它却使他与哈佛大学当时人员稀少的计算机科学系疏远了,而且他还因未能通过博士学位答辩而遭受了罕见的侮辱。他在施乐帕洛阿尔托研究中心 (PARC) 任职,该中心正在开发第一台联网个人计算机 Alto。(他后来向哈佛大学提交了修改后的论文并获得了学位。)
Robert “Bob” Metcalfe (b. 1946) earned undergraduate degrees from MIT in 1969, and then started graduate school at Harvard, just when universities were first connecting to the ARPANET. Harvard resisted Metcalfe’s offer to take charge of its ARPANET connection, so he took a similar job at MIT. While the experience ignited Metcalfe’s passion for networking, it disconnected him from Harvard’s then skeletal computer science faculty, and he suffered the rare indignity of failing his PhD defense. He took a position at the Xerox Palo Alto Research Center (PARC), where the Alto, the first networked personal computer, was under development. (He later submitted a revised thesis to Harvard and was awarded the degree.)
在帕洛阿尔托研究中心,梅特卡夫了解了连接夏威夷群岛的无线分组 Aloha 网络,并意识到解决计算机与电线互连的微妙问题的解决方案是放松控制。计算机可以通过使用物理“分接头”(在电缆上任何方便的位置插入铜芯的尖峰)将它们连接到公共同轴电缆来联网。计算机通过将数据包转储到无源网络(消息将在两个方向上传播)并侦听网络中发送给它们的数据包来进行通信。David Boggs(生于 1950 年)是帕洛阿尔托研究中心的一位年轻无线电工程师,他将自己的无线工程专业知识应用到了梅特卡夫的想法中。由于联网的计算机不同步,数据包不可避免地会重叠或冲突,这种情况可以通过重传来处理。因此,同轴电缆类似于 17 世纪的“以太”,人们假设它可以解释真空中光的波动性。这个名字一直沿用至今,而且由于无源网络的理念得到了施乐公司低廉许可费的支持,以太网成为了普遍存在的标准。梅特卡夫和博格斯都取得了创业成功。
At PARC Metcalfe learned about the wireless packet Aloha Network connecting the Hawaiian islands and realized that the solution to the subtleties of interconnecting computers with wires was to relax control. Computers would be networked by connecting them to a common coaxial cable using a physical “tap”—a spike driven into the copper core at any convenient spot along the cable. Computers would communicate by dumping data packets onto the passive network, where the messages would spread out in both directions, and by listening to the network for packets meant for them. David Boggs (b. 1950), a young radio engineer also at PARC, applied his wireless engineering expertise to Metcalfe’s idea. Since the networked computers were not synchronized, packets would inevitably overlap or collide, a circumstance that could be handled by retransmission. So a coaxial cable was analogous to the “ether” that had in the seventeenth century been hypothesized to explain the wave nature of light in a vacuum. The name stuck, and because the idea of a passive network was backed up by low licensing fees from Xerox, Ethernet became a ubiquitous standard. Both Metcalfe and Boggs went on to entrepreneurial success.
人们可以将分布式计算描述为一系列分散程度不同的活动,一个极端是远程计算机网络,另一个极端是多重处理。远程计算机网络是以前孤立的、广泛分离的、相当大的计算系统的松散互连。多处理是用数量越来越多、体积越来越小的并行计算部件来构建以前的单片和串行计算系统。接近这个范围的中间的是本地网络,计算机互连以获得计算机网络的资源共享和多处理的并行性。
One can characterize distributed computing as a spectrum of activities varying in their degree of decentralization, with one extreme being remote computer networking and the other extreme being multiprocessing. Remote computer networking is the loose interconnection of previously isolated, widely separated, and rather large computing systems. Multiprocessing is the construction of previously monolithic and serial computing systems from increasingly numerous and smaller pieces computing in parallel. Near the middle of this spectrum is local networking, the interconnection of computers to gain the resource sharing of computer networking and the parallelism of multiprocessing.
计算机之间的分离及其通信的相关比特率可用于将分布式计算范围划分为广泛的活动。间隔和比特率的乘积现在约为每秒 1 吉比特米 (1 Gbmps),这表明了当前通信技术的极限,并且预计会随着时间的推移而增加(图 41.1 )。
The separation between computers and the associated bit rate of their communication can be used to divide the distributed computing spectrum into broad activities. The product of separation and bit rate, now about 1 gigabit-meter per second (1 Gbmps), is an indication of the limit of current communication technology and can be expected to increase with time (Figure 41.1).
夏威夷大学的阿罗哈网络最初是为了应用分组无线电技术在中央计算机与其分散在夏威夷群岛的终端之间进行通信而开发的。许多终端现在都是小型计算机,使用 Aloha Network 的 Menehune 作为数据包交换机进行相互通信。Menehune 和 ARPANET Imp 现已连接,为 Aloha 网络上的终端提供对美国大陆计算资源的访问。
The Aloha Network at the University of Hawaii was originally developed to apply packet radio techniques for communication between a central computer and its terminals scattered among the Hawaiian Islands. Many of the terminals are now minicomputers communicating among themselves using the Aloha Network’s Menehune as a packet switch. The Menehune and an ARPANET Imp are now connected, providing terminals on the Aloha Network access to computing resources on the U.S. mainland.
正如计算机网络已经跨越大陆和海洋以互连世界各地的主要计算设施一样,它们现在也沿着走廊和建筑物之间发展以互连办公室和实验室中的小型计算机(Ashenhurst 和 Vonderohe,1975;Farber,1973,1975a,b;威拉德,1973)。
Just as computer networks have grown across continents and oceans to interconnect major computing facilities around the world, they are now growing down corridors and between buildings to interconnect minicomputers in offices and laboratories (Ashenhurst and Vonderohe, 1975; Farber, 1973, 1975a,b; Willard, 1973).
最近,小型计算机已以多处理器配置连接,以实现经济性、可靠性和增强的系统模块化(Ornstein 等人,1975 年;Wulf 和 Levin,1972 年)。趋势是为了可靠性而走向去中心化。松散耦合的多处理器系统较少依赖于共享中央内存,而更多地依赖于细线来进行进程间通信并增强组件隔离(Metcalfe,1972b;Roberts 和 Wessler,1970b)。随着为了可靠性而进行的处理器间通信的不断细化以及分布式应用程序的发展,多处理正在逐渐接近分布式计算的本地形式。
More recently minicomputers have been connected in multiprocessor configurations for economy, reliability, and increased system modularity (Ornstein et al., 1975; Wulf and Levin, 1972). The trend has been toward decentralization for reliability; loosely coupled multiprocessor systems depend less on shared central memory and more on thin wires for interprocess communication with increased component isolation (Metcalfe, 1972b; Roberts and Wessler, 1970b). With the continued thinning of interprocessor communication for reliability and the development of distributable applications, multiprocessing is gradually approaching a local form of distributed computing.
在详细描述以太网之前,我们提供以下概述(见图41.2)。
Before going into a detailed description of Ethernet, we offer the following overview (see Figure 41.2).
图 41.2: 两段以太网
Figure 41.2: A two-segment Ethernet
以太网是计算站之间本地通信的系统。我们的实验以太网使用分接同轴电缆在个人小型计算机、打印设备、大型文件存储设备、磁带备份站、大型中央计算机和长途通信设备之间传输可变长度的数字数据包。
Ethernet is a system for local communication among computing stations. Our experimental Ethernet uses tapped coaxial cables to carry variable-length digital data packets among, for example, personal minicomputers, printing facilities, large file storage devices, magnetic tape backup stations, larger central computers, and longer-haul communication equipment.
共享通信设施(分支以太)是被动的。站点的以太网接口通过接口电缆以位串行方式连接到收发器,而收发器又接入通过的以太网。数据包被广播到以太网上,被所有站听到,并由根据数据包的前导地址位选择数据包的目的地从以太网复制。这是广播分组交换,应当与存储转发分组交换区分开来,在存储转发分组交换中路由由中间处理元件执行。为了满足增长的需求,可以使用用于信号再生的数据包中继器、用于流量本地化的数据包过滤器以及用于网络地址扩展的数据包网关来扩展以太网。
The shared communication facility, a branching Ether, is passive. A station’s Ethernet interface connects bit-serially through an interface cable to a transceiver which in turn taps into the passing Ether. A packet is broadcast onto the Ether, is heard by all stations, and is copied from the Ether by destinations which select it according to the packet’s leading address bits. This is broadcast packet switching and should be distinguished from store-and-forward packet switching in which routing is performed by intermediate processing elements. To handle the demands of growth, an Ethernet can be extended using packet repeaters for signal regeneration, packet filters for traffic localization, and packet gateways for internetwork address extension.
控制完全分布在通过统计仲裁协调数据包传输的站之间。由站点发起的传输会推迟到任何可能已经在进行中的传输。一旦开始,如果检测到与其他数据包的干扰,则传输将中止并由其源站重新安排。经过一段时间的无干扰传输后,所有站都会听到一个数据包,并将在没有干扰的情况下运行完成。冲突站中的以太网控制器各自生成随机重传间隔,以避免重复冲突。数据包重传间隔的平均值根据冲突历史记录进行调整,以随着网络负载的变化保持以太网利用率接近最佳状态。
Control is completely distributed among stations with packet transmissions coordinated through statistical arbitration. Transmissions initiated by a station defer to any which may already be in progress. Once started, if interference with other packets is detected, a transmission is aborted and rescheduled by its source station. After a certain period of interference-free transmission, a packet is heard by all stations and will run to completion without interference. Ethernet controllers in colliding stations each generate random retransmission intervals to avoid repeated collisions. The mean of a packet’s retransmission intervals is adjusted as a function of collision history to keep Ether utilization near the optimum with changing network load.
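The idea of random retransmission intervals whose mean follows collision history can be sketched as follows. The text does not give the adjustment rule here, so the doubling of the mean after each collision is an assumption, and `retransmission_delay`, `mean_delay`, and `slot_time` are hypothetical names.

```python
import random


def mean_delay(collisions, slot_time=1.0):
    """Mean retransmission interval; assumed to double with each collision."""
    return slot_time * (2 ** collisions) / 2


def retransmission_delay(collisions, slot_time=1.0, rng=random):
    """Draw a random delay, uniform on [0, 2 * mean_delay]."""
    if collisions < 0:
        raise ValueError("collisions must be non-negative")
    return rng.uniform(0.0, 2 * mean_delay(collisions, slot_time))


# Each colliding station draws its own delay, so a repeat collision is unlikely.
rng = random.Random(1976)
for collisions in range(4):
    delay = retransmission_delay(collisions, rng=rng)
    print(f"after {collisions} collisions: mean {mean_delay(collisions)}, drew {delay:.3f}")
```

Randomizing separates the colliding stations in time, and lengthening the mean under heavier collision history spreads retries out as load rises.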
即使在没有源检测到的干扰的情况下传输,数据包仍然可能无法准确无误地到达目的地;因此,数据包仅以高概率传送。要求残余错误率低于裸以太网数据包传输机制所提供的残留错误率的站必须遵循双方商定的数据包协议。
Even when transmitted without source-detected interference, a packet may still not reach its destination without error; thus, packets are delivered only with high probability. Stations requiring a residual error rate lower than that provided by the bare Ethernet packet transport mechanism must follow mutually agreed upon packet protocols.
我们的目标是设计一个可以平稳扩展的通信系统,以容纳多个装有个人计算机的建筑物及其支持所需的设施。
Our object is to design a communication system which can grow smoothly to accommodate several buildings full of personal computers and the facilities needed for their support.
与要连接的计算站一样,通信系统必须便宜。我们选择将通信设施的控制分布在通信计算机之间,以消除主动中央控制器的可靠性问题,避免在并行性丰富的系统中产生瓶颈,并降低使小型系统不经济的固定成本。
Like the computing stations to be connected, the communication system must be inexpensive. We choose to distribute control of the communications facility among the communicating computers to eliminate the reliability problems of an active central controller, to avoid creating a bottleneck in a system rich in parallelism, and to reduce the fixed costs which make small systems uneconomical.
以太网设计始于 Aloha 网络中开发的数据包冲突和重传的基本思想(Abramson,1970)。我们预计,像 Aloha 网络一样,以太网将承载突发流量,从而使传统的同步时分复用 (STDM) 效率低下(Abramson,1970、1973;Metcalfe,1973;Roberts 和 Wessler,1970a)。我们看到了 Aloha 方法对无线电信道复用和分布式控制的前景。希望它能够有效地应用于适合本地计算机通信的媒体。通过我们自己的多项创新,我们的承诺得以实现。
Ethernet design started with the basic idea of packet collision and retransmission developed in the Aloha Network (Abramson, 1970). We expected that, like the Aloha Network, Ethernets would carry bursty traffic so that conventional, synchronous time-division multiplexing (STDM) would be inefficient (Abramson, 1970, 1973; Metcalfe, 1973; Roberts and Wessler, 1970a). We saw promise in the Aloha approach to distributed control of radio channel multiplexing and hoped that it could be applied effectively with media suited to local computer communication. With several innovations of our own, the promise is realized.
以太网以历史上发光的以太命名,据称电磁辐射可以通过以太传播。与 Aloha 无线电发射器一样,以太网发射器将称为数据包的完全寻址发射器同步位序列广播到以太网上,并希望它们被预期的接收器听到。以太网是一种逻辑上用于传播数字信号的无源介质,可以使用任意数量的介质(包括同轴电缆、双绞线和光纤)来构建。
Ethernet is named for the historical luminiferous ether through which electromagnetic radiations were once alleged to propagate. Like an Aloha radio transmitter, an Ethernet transmitter broadcasts completely-addressed transmitter-synchronous bit sequences called packets onto the Ether and hopes that they are heard by the intended receivers. The Ether is a logically passive medium for the propagation of digital signals and can be constructed using any number of media including coaxial cables, twisted pairs, and optical fibers.
以太网的拓扑结构是一棵无根树的拓扑结构。它是一棵树,因此以太可以在建筑物走廊的入口处分支,同时避免多路径干扰。任何源和目的地之间必须只有一条通过以太网的路径;如果存在多条路径,传输就会干扰自身:经过不同长度的路径重复到达其预定目的地。以太是无根的,因为它可以从任何一点向任何方向延伸。任何希望加入以太网的站点都可以在最近的方便点接入以太网。
The topology of the Ethernet is that of an unrooted tree. It is a tree so that the Ether can branch at the entrance to a building’s corridor, yet avoid multipath interference. There must be only one path through the Ether between any source and destination; if more than one path were to exist, a transmission would interfere with itself: repeatedly arriving at its intended destination having travelled by paths of different length. The Ether is unrooted because it can be extended from any of its points in any direction. Any station wishing to join an Ethernet taps into the Ether at the nearest convenient point.
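The single-path requirement can be expressed as a check that the cable layout forms a tree. The sketch below uses a union-find structure; the representation of taps as labels and cable segments as pairs, and the name `is_unrooted_tree`, are assumptions for illustration.

```python
def is_unrooted_tree(points, segments):
    """Return True if every pair of points is joined by exactly one path."""
    parent = {p: p for p in points}

    def find(x):
        # Follow parent links to the representative, compressing as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in segments:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # a second path exists: multipath interference
        parent[root_a] = root_b

    # All points must end up in one connected component.
    return len({find(p) for p in points}) == 1


points = ["A", "B", "C", "D"]
print(is_unrooted_tree(points, [("A", "B"), ("B", "C"), ("B", "D")]))
print(is_unrooted_tree(points, [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D")]))
```

The first layout branches at B and is a valid tree; the second closes a loop between C and D, so a transmission could reach a destination by two paths of different length.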
从互连和控制的关系来看,以太网是星型网络的对偶。以太网不像星形网络那样通过许多单独的链路进行分布式互连并在交换节点中进行集中控制,而是通过以太网进行集中互连并在其站之间进行分布式控制。
Looking at the relationship of interconnection and control, we see that Ethernet is the dual of a star network. Rather than distributed interconnection through many separate links and central control in a switching node, as in a star network, the Ethernet has central interconnection through the Ether and distributed control among its stations.
与 Aloha 网络(具有传出广播通道和传入多路访问通道的星形网络)不同,以太网支持通过单个广播多路访问通道进行多对多通信。
Unlike an Aloha Network which is a star network with an outgoing broadcast channel and an incoming multi-access channel, an Ethernet supports many-to-many communication with a single broadcast multi-access channel.
当以太大部分未被使用时,站点可以随意发送数据包,数据包被无差错地接收,一切正常。随着越来越多的站点开始传输,数据包干扰率也会增加。每个站内置的以太网控制器会按碰撞频率成比例地调整平均重传间隔;因此,相互竞争的站间传输对以太的共享保持在接近最佳的状态。
When the Ether is largely unused, a station transmits its packets at will, the packets are received without error, and all is well. As more stations begin to transmit, the rate of packet interference increases. Ethernet controllers in each station are built to adjust the mean retransmission interval in proportion to the frequency of collisions; sharing of the Ether among competing station-station transmissions is thereby kept near the optimum.
各站之间需要一定程度的合作才能公平地共享以太。在要求较高的应用中,某些站可能会通过某种系统性地违反公平规则的方式来有效地获得传输优先权。站点可能会通过不随着流量增加而调整重传间隔或发送非常大的数据包来篡夺以太。现在,这两种做法都被每个站的低级软件所禁止。
A degree of cooperation among the stations is required to share the Ether equitably. In demanding applications certain stations might usefully take transmission priority through some systematic violation of equity rules. A station could usurp the Ether by not adjusting its retransmission interval with increasing traffic or by sending very large packets. Both practices are now prohibited by low level software in each station.
以太网尽最大努力成功传输数据包,但源站和目标站中的进程有责任采取必要的预防措施,以确保可靠的通信达到其自身所需的质量(Metcalfe,1972b,1973)。认识到承诺“无错误”通信的成本和危险,我们避免保证任何单个数据包的可靠传输,以实现多个数据包的平均传输经济性和高可靠性(Metcalfe,1973)。从数据包传输机制中消除可靠通信的责任使我们能够根据应用程序定制可靠性,并将错误恢复放在最能发挥作用的地方。由于以太网在网络层次结构中互连,因此该策略变得更加重要,数据包必须在网络层次结构中传输更远的距离并承受更大的风险。
An Ethernet gives its best efforts to transmit packets successfully, but it is the responsibility of processes in the source and destination stations to take the precautions necessary to assure reliable communication of the quality they themselves desire (Metcalfe, 1972b, 1973). Recognizing the costliness and dangers of promising “error-free” communication, we refrain from guaranteeing reliable delivery of any single packet to get both economy of transmission and high reliability averaged over many packets (Metcalfe, 1973). Removing the responsibility for reliable communication from the packet transport mechanism allows us to tailor reliability to the application and to place error recovery where it will do the most good. This policy becomes more important as Ethernets are interconnected in a hierarchy of networks through which packets must travel farther and suffer greater risks.
我们的实验以太网提供了五种机制来降低丢失数据包的概率和成本。它们是 (1) 载波检测、(2) 干扰检测、(3) 数据包错误检测、(4) 截断数据包过滤和 (5) 冲突共识执行。
Five mechanisms are provided in our experimental Ethernet for reducing the probability and cost of losing a packet. These are (1) carrier detection, (2) interference detection, (3) packet error detection, (4) truncated packet filtering, and (5) collision consensus enforcement.
41.3.5.1 载波检测 当数据包的比特由站点放置在以太上时,它们会被进行相位编码(就像磁带上的比特一样),这保证了在每个比特时间内以太上至少有一次转换。因此,可以通过侦听这些转换来检测数据包在以太上的通过。借用无线电的类比,当数据包通过收发器时,我们称之为载波存在。由于站点可以感知正在通过的数据包的载波,因此它可以延迟发送自己的数据包,直到检测到的数据包安全通过。Aloha 网络没有载波检测,因此冲突率要高得多。如果没有载波检测,以太的使用效率将随着数据包长度的增加而降低。在下面的第41.6节中,我们展示了通过载波检测,以太效率随着数据包长度的增加而增加。
41.3.5.1 Carrier detection As a packet’s bits are placed on the Ether by a station, they are phase encoded (like bits on a magnetic tape), which guarantees that there is at least one transition on the Ether during each bit time. The passing of a packet on the Ether can therefore be detected by listening for its transitions. To use a radio analogy, we speak of the presence of carrier as a packet passes a transceiver. Because a station can sense the carrier of a passing packet, it can delay sending one of its own until the detected packet passes safely. The Aloha Network does not have carrier detection and consequently suffers a substantially higher collision rate. Without carrier detection, efficient use of the Ether would decrease with increasing packet length. In §41.6 below, we show that with carrier detection, Ether efficiency increases with increasing packet length.
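A minimal sketch of phase encoding shows why a transition is guaranteed in every bit time. Each bit becomes two half-bit signal levels of opposite value; which polarity stands for 1 is an arbitrary choice made here for illustration, and the function names are hypothetical.

```python
def phase_encode(bits):
    """Encode each bit as two opposite half-bit levels (assumed polarity)."""
    levels = []
    for bit in bits:
        levels.extend([0, 1] if bit else [1, 0])
    return levels


def carrier_present(levels):
    """True if every bit cell contains a mid-cell transition."""
    if not levels or len(levels) % 2:
        return False
    return all(levels[i] != levels[i + 1] for i in range(0, len(levels), 2))


def phase_decode(levels):
    """Recover the bits from a well-formed sequence of half-bit levels."""
    return [1 if levels[i] < levels[i + 1] else 0 for i in range(0, len(levels), 2)]


signal = phase_encode([1, 0, 1, 1])
print(signal)
print(carrier_present(signal), carrier_present([0, 0, 0, 0]))
```

Because even a long run of identical bits still toggles the line once per bit cell, a listening station can tell a passing packet from a silent Ether and defer accordingly.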
通过载波检测,我们能够实现退让(deference):任何站点在听到载波时都不会开始传输。有了退让,就有了获取(acquisition):一旦数据包传输已经持续了一个以太端到端传播时间,所有站都会听到载波并退让;以太已被获取,传输将在没有干扰冲突的情况下完成。通过载波检测,只有当两个或多个站点发现以太处于静默状态并同时(即在一个以太端到端传播时间内)开始传输时,才会发生冲突。这几乎总是发生在某次数据包传输刚结束之后,而在该传输期间有两个或多个站在退让。由于站点目前在退让之后不进行随机化,因此当传输终止时,等待的站点会一拥而上、发生冲突、随机化并重新传输。
With carrier detection we are able to implement deference: no station will start transmitting while hearing carrier. With deference comes acquisition: once a packet transmission has been in progress for an Ether end-to-end propagation time, all stations are hearing carrier and are deferring; the Ether has been acquired and the transmission will complete without an interfering collision. With carrier detection, collisions should occur only when two or more stations find the Ether silent and begin transmitting simultaneously: within an Ether end-to-end propagation time. This will almost always happen immediately after a packet transmission during which two or more stations were deferring. Because stations do not now randomize after deferring, when the transmission terminates, the waiting stations pile on together, collide, randomize, and retransmit.
41.3.5.2 干扰检测 每个收发器都有一个干扰检测器。当收发器注意到从以太网接收的比特值与尝试发送的比特值之间存在差异时,就表明存在干扰。
41.3.5.2 Interference detection Each transceiver has an interference detector. Interference is indicated when the transceiver notices a difference between the value of the bit it is receiving from the Ether and the value of the bit it is attempting to transmit.
干扰检测具有三个优点。首先,检测到冲突的站点知道其数据包已被损坏。可以安排数据包立即重传,避免长时间的确认超时。其次,以太上的干扰周期被限制为最多一个往返时间。Aloha 网络中的冲突数据包会一直运行到完成,但因以太网冲突而被截断的数据包仅浪费以太上数据包时间的一小部分。第三,检测到的干扰频率用于估计以太流量,以调整重传间隔并优化信道效率。
Interference detection has three advantages. First, a station detecting a collision knows that its packet has been damaged. The packet can be scheduled for retransmission immediately, avoiding a long acknowledgement timeout. Second, interference periods on the Ether are limited to a maximum of one round trip time. Colliding packets in the Aloha Network run to completion, but the truncated packets resulting from Ethernet collisions waste only a small fraction of a packet time on the Ether. Third, the frequency of detected interference is used to estimate Ether traffic for adjusting retransmission intervals and optimizing channel efficiency.
41.3.5.3 数据包错误检测 当数据包被放置在以太网上时,会计算并附加校验和。当从以太网读取数据包时,会重新计算校验和。不携带一致校验和的数据包将被丢弃。通过这种方式,传输错误、脉冲噪声错误和由于未检测到的干扰而导致的错误都会在数据包的目的地被捕获。
41.3.5.3 Packet error detection As a packet is placed on the Ether, a checksum is computed and appended. As the packet is read from the Ether, the checksum is recomputed. Packets which do not carry a consistent checksum are discarded. In this way transmission errors, impulse noise errors and errors due to undetected interference are caught at a packet’s destination.
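The append-and-recompute scheme can be sketched in a few lines. The passage does not say which checksum the experimental Ethernet computes, so Python's `zlib.crc32` is used purely as a stand-in, and `frame` and `accept` are hypothetical names.

```python
import zlib

CHECK_BYTES = 4  # size of the stand-in CRC-32 trailer


def frame(payload: bytes) -> bytes:
    """Sender side: compute a checksum over the payload and append it."""
    return payload + zlib.crc32(payload).to_bytes(CHECK_BYTES, "big")


def accept(received: bytes):
    """Receiver side: recompute the checksum; discard on mismatch."""
    if len(received) < CHECK_BYTES:
        return None
    payload, trailer = received[:-CHECK_BYTES], received[-CHECK_BYTES:]
    if zlib.crc32(payload).to_bytes(CHECK_BYTES, "big") != trailer:
        return None  # inconsistent checksum: packet is dropped
    return payload


sent = frame(b"hello, Ether")
damaged = bytes([sent[0] ^ 0x01]) + sent[1:]  # flip one bit in transit
print(accept(sent))
print(accept(damaged))
```

The intact packet is returned and the damaged one is discarded; recovering the lost packet is left to higher-level protocols, as the text describes.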
41.3.5.4 截断数据包过滤 干扰检测和退让使大多数冲突只产生仅有几个比特的截断数据包;冲突站检测到干扰并在一个以太往返时间内中止传输。为了减少拒绝此类明显损坏的数据包会给侦听站软件带来的处理负载,截断的数据包在硬件中被过滤掉。
41.3.5.4 Truncated packet filtering Interference detection and deference cause most collisions to result in truncated packets of only a few bits; colliding stations detect interference and abort transmission within an Ether round-trip time. To reduce the processing load that the rejection of such obviously damaged packets would place on listening station software, truncated packets are filtered out in hardware.
41.3.5.5 冲突共识执行 当一个站点确定其传输受到干扰时,它会短暂地阻塞以太,以确保冲突中的所有其他参与者都检测到干扰,并因退让而被迫中止。如果没有这种冲突共识执行机制,本应最后一个检测到冲突的发送站可能检测不到冲突,因为其他干扰传输会相继中止并停止干扰。尽管数据包对于最后一个发送器来说可能看起来不错,但冲突的发送器和预期接收器之间的不同路径长度将导致数据包到达时损坏。
41.3.5.5 Collision consensus enforcement When a station determines that its transmission is experiencing interference, it momentarily jams the Ether to insure that all other participants in the collision will detect interference and, because of deference, will be forced to abort. Without this collision consensus enforcement mechanism, it is possible that the transmitting station which would otherwise be the last to detect a collision might not do so as the other interfering transmissions successively abort and stop interfering. Although the packet may look good to that last transmitter, different path lengths between the colliding transmitters and the intended receiver will cause the packet to arrive damaged.
我们选择 1 公里、每秒 3 兆比特和 256 个站作为实验以太网的参数,是基于本地分布式计算机通信环境的特征以及我们对勉强可实现的目标的评估;它们当然不是以太网概念所必需的硬性限制。
Our choices of 1 kilometer, 3 megabits per second, and 256 stations for the parameters of an experimental Ethernet were based on characteristics of the locally-distributed computer communication environment and our assessments of what would be marginally achievable; they were certainly not hard restrictions essential to the Ethernet concept.
我们预计合理的最大网络规模约为 1 公里电缆。我们使用这个工作数在不同信号衰减的以太网中进行选择,并设计具有适当功率和灵敏度的收发器。
We expected that a reasonable maximum network size would be on the order of 1 kilometer of cable. We used this working number to choose among Ethers of varying signal attenuation and to design transceivers with appropriate power and sensitivity.
我们实验以太网上的主要站点是一台小型计算机,每秒 3 兆位的数据传输速率是很方便的。通过将峰值速率保持在远低于计算机到主内存的路径的速率,我们减少了以太网接口中对昂贵的专用数据包缓冲的需求。通过保持尽可能高的峰值速率,我们可以提供更多数量的站和更雄心勃勃的多处理通信应用。
The dominant station on our experimental Ethernet is a minicomputer for which 3 megabits per second is a convenient data transfer rate. By keeping the peak rate well below that of the computer’s path to main memory, we reduce the need for expensive special-purpose packet buffering in our Ethernet interfaces. By keeping the peak rate as high as is convenient, we provide for larger numbers of stations and more ambitious multiprocessing communications applications.
为了加快 256 个站之间的低级数据包处理,我们将数据包的第一个 8 位字节分配为目标地址字段,将第二个字节分配为源地址字段(见图41.3)。256 是一个足够小的数字,足以让每个站获得足够的可用带宽份额,并且接近我们使用当前分接电缆技术所能实现的极限。256只是最低层协议的一个方便的数字;更高级别可以容纳扩展的地址空间以及数据包内的附加字段和软件来解释它们。
To expedite low-level packet handling among 256 stations, we allocate the first 8-bit byte of the packet to be the destination address field and the second byte to be the source address field (see Figure 41.3). 256 is a number small enough to allow each station to get an adequate share of the available bandwidth and approaches the limit of what we can achieve with current techniques for tapping cables. 256 is only a convenient number for the lowest level of protocol; higher levels can accommodate extended address spaces with additional fields inside the packet and software to interpret them.
图 41.3: 以太网数据包布局
Figure 41.3: Ethernet packet layout
我们的实验性以太网实现有四个主要部分:以太网、收发器、接口和控制器(见图41.2)。
Our experimental Ethernet implementation has four major parts: the Ether, transceivers, interfaces, and controllers (see Figure 41.2).
以太网收发器直接连接到穿过天花板或地板下的以太网。它通过传输数据、接收数据、干扰检测和电源电压的接口电缆中的 5 条双绞线进行供电和控制。当断电时,收发器会自行断开与以太网的连接。这就是我们为可靠性而奋斗的胜败所在;损坏的收发器可以(但不应该)导致整个以太网瘫痪。每个收发器中的看门狗定时器电路会在行为可疑时关闭输出级,从而尝试防止以太网污染。为了简化收发器,我们使用以太网的基频带,但可以构建以太网以使用频分复用以太网的任何适当大小的频带。……
An Ethernet transceiver attaches directly to the Ether which passes by in the ceiling or under the floor. It is powered and controlled through 5 twisted pairs in an interface cable carrying transmit data, receive data, interference detect, and power supply voltages. When unpowered, the transceiver disconnects itself electrically from the Ether. Here is where our fight for reliability is won or lost; a broken transceiver can, but should not, bring down an entire Ethernet. A watchdog timer circuit in each transceiver attempts to prevent pollution of the Ether by shutting down the output stage if it acts suspiciously. For transceiver simplicity we use the Ether’s base frequency band, but an Ethernet could be built to use any suitably sized band of a frequency division multiplexed Ether. …
发送接口使用数据包缓冲区地址和字计数来对可变数量的 16 位字进行序列化和相位编码,这些字从站的存储器中取出并传递到收发器,前面有一个起始位(在图 41.3 中称为 SYNC )和其次是CRC。接收接口使用载波的出现来检测数据包的开始,并使用SYNC位来获取位相位。只要载波保持开启状态,接口就会对传入的比特流进行解码和反序列化,将 16 位字存储在站主存储器的数据包缓冲区中。当载波消失时,接口检查是否已接收到整数个 16 位字以及 CRC 是否正确。假定接收到的最后一个字是 CRC,并且不会复制到数据包缓冲区中。
A transmitting interface uses a packet buffer address and word count to serialize and phase encode a variable number of 16-bit words which are taken from the station’s memory and passed to the transceiver, preceded by a start bit (called SYNC in Figure 41.3) and followed by the CRC. A receiving interface uses the appearance of carrier to detect the start of a packet and uses the SYNC bit to acquire bit phase. As long as carrier stays on, the interface decodes and deserializes the incoming bit stream depositing 16-bit words in a packet buffer in the station’s main memory. When carrier goes away, the interface checks that an integral number of 16-bit words has been received and that the CRC is correct. The last word received is assumed to be the CRC and is not copied into the packet buffer.
这些接口通常包括用于仅接受标头中具有适当地址的那些数据包的硬件。当以太网非常忙于承载发往其他站点的流量时,硬件地址过滤可帮助站点避免繁重的软件数据包处理。
These interfaces ordinarily include hardware for accepting only those packets with appropriate addresses in their headers. Hardware address filtering helps a station avoid burdensome software packet processing when the Ether is very busy carrying traffic intended for other stations.
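The two address bytes and the address filtering described above can be modeled as follows. This is a software sketch of what the text says is done in hardware; the function names are hypothetical, and the sketch covers only the destination and source bytes, not the SYNC bit or the CRC.

```python
from typing import Tuple


def make_packet(dest: int, src: int, payload: bytes) -> bytes:
    """Prefix the payload with one-byte destination and source addresses."""
    for addr in (dest, src):
        if not 0 <= addr <= 255:
            raise ValueError("a station address must fit in one 8-bit byte")
    return bytes([dest, src]) + payload


def parse_header(packet: bytes) -> Tuple[int, int, bytes]:
    """Split a packet into (destination, source, remaining data)."""
    if len(packet) < 2:
        raise ValueError("packet too short to hold both address bytes")
    return packet[0], packet[1], packet[2:]


def accepts(station_address: int, packet: bytes) -> bool:
    """Model of address filtering: keep only packets addressed to this station."""
    return len(packet) >= 2 and packet[0] == station_address
```

Because the destination is the very first byte, a filter of this kind can reject a packet meant for another station without examining the rest of it.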
重传间隔是一个时隙的倍数、开始传输和检测到冲突之间的最大时间、一个端到端往返延迟。以太网控制器以一个时隙的平均重传间隔开始传输每个新数据包。每次传输尝试以冲突结束时,控制器都会延迟一个随机长度的时间间隔(平均为前一个时间间隔的两倍),遵循任何通过的数据包,然后尝试重新传输。这种启发式近似于我们称为二元指数退避的算法(见图41.4)。
Retransmission intervals are multiples of a slot, the maximum time between starting a transmission and detecting a collision, one end-to-end round trip delay. An Ethernet controller begins transmission of each new packet with a mean retransmission interval of one slot. Each time a transmission attempt ends in collision, the controller delays for an interval of random length with a mean twice that of the previous interval, defers to any passing packet, and then attempts retransmission. This heuristic approximates an algorithm we have called Binary Exponential Backoff (see Figure 41.4).
图41.4: 碰撞控制算法
Figure 41.4: Collision control algorithm
当网络空载且冲突很少时,平均值很少偏离 1,并且会迅速重传。随着流量负载的增加,会遇到更多的冲突,站点中会积压数据包,重传间隔会增加,并且重传流量会回退以维持信道效率。
When the network is unloaded and collisions are rare, the mean seldom departs from one and retransmissions are prompt. As the traffic load increases, more collisions are experienced, a backlog of packets builds up in the stations, retransmission intervals increase, and retransmission traffic backs off to sustain channel efficiency.
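The doubling rule can be illustrated with a short sketch. It rests on two assumptions that the text does not fix: that the first delay after a collision has a mean of one slot, and that delays are drawn uniformly from the whole numbers of slots between 0 and twice the current mean. The function names are hypothetical.

```python
import random
from typing import List


def mean_intervals(collisions: int) -> List[int]:
    """Mean delay, in slots, used after each successive collision: 1, 2, 4, ..."""
    return [2 ** k for k in range(collisions)]


def backoff_delays(collisions: int, rng: random.Random) -> List[int]:
    """Draw one random delay (in whole slots) for each successive collision."""
    delays = []
    mean = 1  # assumption: the first retransmission interval has mean one slot
    for _ in range(collisions):
        # Uniform over 0..2*mean whole slots, so the expected delay equals `mean`.
        delays.append(rng.randint(0, 2 * mean))
        mean *= 2  # each further collision doubles the mean interval
    return delays
```

Under light load a packet rarely collides more than once, so the mean stays near one slot; under heavy load repeated collisions spread retransmissions over ever longer windows.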
我们运行一个实验性的两段数据包转发器,但希望避免依赖它们。在对以太进行分支并扩展其信号覆盖范围时,需要在使用复杂的收发器和中继器之间进行权衡。随着功率和灵敏度的提高,收发器变得更加昂贵且可靠性降低。在以太网中引入中继器使得集中互连的以太网变得活跃。收发器的故障将切断其所有者的通信;中继器的故障会导致以太网分区,导致许多通信中断。
We operate an experimental two-segment packet repeater, but hope to avoid relying on repeaters. In branching the Ether and extending its signal cover, there is a trade-off between using sophisticated transceivers and using repeaters. With increased power and sensitivity, transceivers become more expensive and less reliable. The introduction of repeaters into an Ethernet makes the centrally interconnecting Ether active. The failure of a transceiver will sever the communications of its owner; the failure of a repeater partitions the Ether severing many communications.
连接两个以太段的中继器或包过滤器只能有一个;一个数据包被多个中继器重复到一个网段上会产生干扰。但是,连接两个网段的网关数量没有限制;网关仅重复发送给作为中介的自身的数据包。连接两个网段的单个中继器发生故障会导致网络分区;如果网段之间存在通过其他网关的路径,则网关故障无需对网络进行分区。
There can be only one repeater or packet filter connecting two Ether segments; a packet repeated onto a segment by multiple repeaters would interfere with itself. However, there is no limit to the number of gateways connecting two segments; a gateway only repeats packets addressed to itself as an intermediary. Failure of the single repeater connecting two segments partitions the network; failure of a gateway need not partition the net if there are paths through other gateways between the segments.
...我们通过检查交替的以太时间周期开发了一个负载以太网性能的简单模型。第一个称为传输间隔,在此期间为成功的数据包传输获取以太币。第二个称为竞争间隔,由第41.4.4节的重传时隙组成,在此期间站尝试获取以太网的控制权。由于该模型的以太网已加载,并且站点在开始传输之前推迟传递数据包,因此时隙通过前一个采集间隔的尾部进行同步。当没有站选择尝试在其中传输时,时隙将为空,并且如果多个站尝试传输,则该时隙将包含冲突。当一个时隙仅包含一次尝试的传输时,则在数据包的持续时间内已获取以太网,竞争间隔结束,传输间隔开始。
… We develop a simple model of the performance of a loaded Ethernet by examining alternating Ether time periods. The first, called a transmission interval, is that during which the Ether has been acquired for a successful packet transmission. The second, called a contention interval, is that composed of the retransmission slots of §41.4.4, during which stations attempt to acquire control of the Ether. Because the model’s Ethernets are loaded and because stations defer to passing packets before starting transmission, the slots are synchronized by the tail of the preceding acquisition interval. A slot will be empty when no station chooses to attempt transmission in it and it will contain a collision if more than one station attempts to transmit. When a slot contains only one attempted transmission, then the Ether has been acquired for the duration of a packet, the contention interval ends, and a transmission interval begins.
令P为以太网数据包中的位数。令C为以太坊上每秒的峰值容量(以比特为单位)。令T为时隙的时间(以秒为单位),即开始传输后检测到冲突所需的秒数。假设有Q个站连续排队传输数据包;要么获取站在成功获取后立即有一个新数据包,要么另一个站准备好。请注意,Q也恰好给出了网络上提供的总负载,对于此分析,该总负载始终为 1 或更大。我们假设排队站尝试在当前时隙中以 1 /Q的概率进行传输,或者以 1 − (1 /Q ) 的概率进行延迟;这被认为是最佳统计决策规则,通过我们的负载估计重传控制算法在以太网站中近似。
Let P be the number of bits in an Ethernet packet. Let C be the peak capacity in bits per second, carried on the Ether. Let T be the time in seconds of a slot, the number of seconds it takes to detect a collision after starting a transmission. Let us assume that there are Q stations continuously queued to transmit a packet; either the acquiring station has a new packet immediately after a successful acquisition or another station comes ready. Note that Q also happens to give the total offered load on the network which for this analysis is always 1 or greater. We assume that a queued station attempts to transmit in the current slot with probability 1/Q, or delays with probability 1 − (1/Q); this is known to be the optimum statistical decision rule, approximated in Ethernet stations by means of our load-estimating retransmission control algorithms.
对于大小超过 4000 位的数据包,我们实验性以太网的效率保持在 95% 以上。对于大小接近时隙大小的数据包,以太网效率接近 1 /e,即时隙 Aloha 网络的渐近效率(Roberts,1973)。
For packets whose size is above 4000 bits, the efficiency of our experimental Ethernet stays well above 95 percent. For packets with a size approximating that of a slot, Ethernet efficiency approaches 1/e, the asymptotic efficiency of a slotted Aloha Network (Roberts, 1973).
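The model's assumptions are enough to sketch the efficiency calculation, although the derivation itself is elided in this excerpt. If each of Q stations transmits in a slot with probability 1/Q, the chance that exactly one does is (1 − 1/Q)^(Q−1); the expected number of wasted slots before an acquisition then follows from the geometric distribution. The function names below are hypothetical, and the slot time of 16 microseconds used in the usage note is an illustrative assumption, not a figure taken from this excerpt.

```python
def acquisition_probability(q: int) -> float:
    """Probability that exactly one of q queued stations transmits in a slot."""
    if q < 1:
        raise ValueError("the model assumes at least one queued station")
    # q stations, each transmitting with probability 1/q:
    # q * (1/q) * (1 - 1/q)**(q - 1) simplifies to the expression below.
    return (1 - 1 / q) ** (q - 1)


def mean_wasted_slots(q: int) -> float:
    """Expected number of empty or collided slots before a successful one."""
    a = acquisition_probability(q)
    return (1 - a) / a


def efficiency(p_bits: float, c_bps: float, t_slot: float, q: int) -> float:
    """Fraction of time the Ether carries successful packets."""
    transmit_time = p_bits / c_bps
    return transmit_time / (transmit_time + mean_wasted_slots(q) * t_slot)
```

With P = 4000 bits, C = 3 megabits per second, Q = 256, and the assumed slot time, `efficiency(4000, 3e6, 16e-6, 256)` comes out above 0.95. When the packet time equals the slot time, the expression reduces to the acquisition probability, which tends to 1/e as Q grows, in line with the two limits quoted above.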
构建可行的分组通信系统不仅仅是提供分组传输机制。还必须通过在上述第41.3和41.4节中描述的以太控制协议之上实现的更高级别协议来提供纠错、流量控制、进程命名、安全性和记账的方法(Cerf 和 Kahn,1974;Crocker 等人,1972;Farber,1973;Metcalfe,1973;Rowe,1975;Walden,1972)。以太网控制包括数据包成帧、错误检测、寻址和多路访问控制;与其他线路控制程序一样,以太网用于支持多种网络和多处理器架构(IBM,1974,1975)。
There is more to the construction of a viable packet communication system than simply providing the mechanisms for packet transport. Methods for error correction, flow control, process naming, security, and accounting must also be provided through higher level protocols implemented on top of the Ether control protocol described in §§41.3 and 41.4 above (Cerf and Kahn, 1974; Crocker et al., 1972; Farber, 1973; Metcalfe, 1973; Rowe, 1975; Walden, 1972). Ether control includes packet framing, error detection, addressing and multi-access control; like other line control procedures, Ethernet is used to support numerous network and multiprocessor architectures (IBM, 1974, 1975).
下面是对一种简单的错误控制数据包协议的简要描述。EFTP(以太网文件传输协议)之所以引起人们的兴趣,不仅因为它相对容易理解和正确实现,而且因为它在更通用和更高效的协议的开发过程中尽职尽责地承载了许多有价值的文件。
Here is a brief description of one simple error-controlling packet protocol. The EFTP (Ethernet File Transfer Protocol) is of interest both because it is relatively easy to understand and implement correctly and because it has dutifully carried many valuable files during the development of more general and efficient protocols.
图 41.5: EFTP 数据包布局
Figure 41.5: EFTP packet layout
很明显,我们很少注意将某些字段填充到正确的位数中。这里的重点是编程的简单性和易用性。尽管有这样的免责声明,我们确实认为在宽敞的场地上犯错误更为明智。无论你如何尝试,一个或另一个字段总是会变得太小。
It should be obvious that little care has been taken to cram certain fields into just the right number of bits. The emphasis here is on simplicity and ease of programming. Despite this disclaimer, we do feel that it is more advisable to err on the side of spacious fields; try as you may, one field or another will always turn out to be too small.
软件校验和字用于降低未检测到错误的概率。它不仅用作实验以太网串行硬件 16 位循环冗余校验和(图 41.3中)的备份,而且还用于防止站内未经 CRC 检查的并行数据路径发生故障。EFTP 使用的校验和是 1 的补码添加并循环整个数据包,包括标头和内容数据。校验和可以被忽略,但用户在任一端都将承担风险;发送方可以将全 1(不可能的值)放入校验和字中,以向接收方指示未计算校验和。
The software checksum word is used to lower the probability of an undetected error. It serves not only as a backup for the experimental Ethernet’s serial hardware 16-bit cyclic redundancy checksum (in Figure 41.3), but also for protection against failures in parallel data paths within stations which are not checked by the CRC. The checksum used by the EFTP is a 1’s complement add and cycle over the entire packet, including header and content data. The checksum can be ignored at the user’s peril at either end; the sender may put all 1’s (an impossible value) into the checksum word to indicate to the receiver that no checksum was computed.
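One plausible reading of "1's complement add and cycle" is sketched below: each 16-bit word is added to a running sum with end-around carry, and the sum is then rotated left by one bit. The text does not spell out the exact procedure, so both that reading and the step that maps an all-ones result to zero (to keep all 1's free as the "no checksum" marker) are assumptions, and the function names are hypothetical.

```python
from typing import Iterable

MASK = 0xFFFF
NO_CHECKSUM = 0xFFFF  # all 1's: the sender computed no checksum


def ones_complement_add(a: int, b: int) -> int:
    """Add two 16-bit words, folding any carry back into the low bit."""
    total = a + b
    return (total & MASK) + (total >> 16)


def rotate_left(word: int) -> int:
    """Cycle a 16-bit word left by one bit position."""
    return ((word << 1) & MASK) | (word >> 15)


def checksum(words: Iterable[int]) -> int:
    """Add-and-cycle over every word of the packet (header and data)."""
    acc = 0
    for word in words:
        acc = rotate_left(ones_complement_add(acc, word))
    # Assumption: all 1's is reserved, so that result is mapped to zero.
    return 0 if acc == MASK else acc


def verify(words: Iterable[int], received: int) -> bool:
    """Accept if the sender skipped the checksum or if it matches."""
    return received == NO_CHECKSUM or checksum(words) == received
```

The rotation makes the result depend on word order, which a plain sum would not; that is one reason a scheme of this kind can catch words swapped in a faulty parallel data path.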
41.7.2.1 数据传输 文件的 16 位字通过从 0 开始连续编号的数据包从发送站传送到接收站。每个数据包由发送方定期重传,直到从接收方返回具有匹配序列号的 ack 数据包。接收方将忽略所有损坏的数据包、来自发送方以外的站点的数据包以及序列号与预期序列号或前一序列号不匹配的数据包。当数据包具有预期的序列号时,该数据包将被确认,其数据被接受为文件的一部分,并且序列号会递增。当数据包到达时其序列号比预期值小 1 时,它会被确认并被丢弃;假设其 ack 丢失并需要重传。
41.7.2.1 Data transfer The 16-bit words of a file are carried from sending station to receiving station in data packets consecutively numbered from 0. Each data packet is retransmitted periodically by the sender until an ack packet with a matching sequence number is returned from the receiver. The receiver ignores all damaged packets, packets from a station other than the sender, and packets whose sequence number does not match either the expected one or the one preceding. When a packet has the expected sequence number, the packet is acked, its data is accepted as part of the file, and the sequence number is incremented. When a packet arrives with a sequence number one less than that expected, it is acknowledged and discarded; the presumption is that its ack was lost and needs retransmission.
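The receiver's rules in the paragraph above amount to a small state machine, sketched here. The class and method names are hypothetical, and damage detection is reduced to a boolean flag rather than an actual checksum test.

```python
from typing import List, Optional


class EftpReceiver:
    """Receiver side of the data-transfer phase, as described in the text."""

    def __init__(self, sender: int) -> None:
        self.sender = sender          # the only station we accept data from
        self.expected = 0             # data packets are numbered from 0
        self.file_words: List[int] = []

    def on_data(self, source: int, seq: int, words: List[int],
                damaged: bool = False) -> Optional[int]:
        """Return the sequence number to ack, or None if the packet is ignored."""
        if damaged or source != self.sender:
            return None
        if seq == self.expected:
            # In-sequence packet: accept its data and advance.
            self.file_words.extend(words)
            self.expected += 1
            return seq
        if seq == self.expected - 1:
            # Duplicate of the previous packet: its ack was presumably lost,
            # so ack again but do not store the data a second time.
            return seq
        return None
```

For example, delivering packet 0 twice yields two acks but stores its words only once, which is how a lost ack is repaired without corrupting the file.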
41.7.2.2 结束 当所有数据都发送完毕后,发送带有下一个连续序列号的结束数据包,然后发送方等待匹配的结束回复。按顺序接受结束数据包后,数据接收器以匹配的结束回复进行响应,然后拖延一段相当长的时间(10 秒)。收到结束回复后,发送站会发送回显结束回复,并可以自由地离开,以确保文件已成功传输。然后,磨蹭的接收者会收到回响的结束回复,并且也会放心地离开。
41.7.2.2 End When all the data has been transmitted, an end packet is sent with the next consecutive sequence number and then the sender waits for a matching endreply. Having accepted an end packet in sequence, the data receiver responds with a matching endreply and then dallies for some reasonably long period of time (10 seconds). Upon getting the endreply, the sending station transmits an echoing endreply and is free to go off with the assurance that the file has been transferred successfully. The dallying receiver then gets the echoed endreply and it too goes off assured.
相对复杂的结束延迟序列旨在实际上确保文件的发送者和接收者将就文件是否已正确传输达成一致。如果最终数据包丢失,数据发送方只需重新传输它,就像任何具有逾期确认的数据包一样。如果来自数据接收方的结束应答丢失,数据发送方将以同样的方式超时并重新发送结束数据包,该数据包又将被调拨接收方确认。如果回显的结束应答丢失,则等待的接收者将很不方便,必须等待它,但是当超时时,接收者仍然可以确保文件的成功传输,因为结束数据包已被接收。
The comparatively complex end-dally sequence is intended to make it practically certain that the sender and receiver of a file will agree on whether the file has been transmitted correctly. If the end packet is lost, the data sender simply retransmits it as it would any packet with an overdue acknowledgement. If the endreply from the data receiver is lost, the data sender will time out in the same way and retransmit the end packet which will in turn be acknowledged by the dallying receiver. If the echoed endreply is lost, the dallying receiver will be inconvenienced having to wait for it, but when it has timed out, the receiver can nevertheless be assured of successful transfer of the file because the end packet has been received.
在这一切过程中的任何时候,任何一方都可以自由地决定沟通失败并放弃;在发生用户发起的中止或文件系统错误等情况下,发送中止数据包以立即结束通信被认为是礼貌的。
At any time during all of this, either side is free to decide communication has failed and just give up; it is considered polite to send an abort packet to end the communication promptly in the event of, say, a user-initiated abort or a file system error.
41.7.2.3 EFTP 缺点 EFTP 非常有用,但它的缺点也很多。首先,该协议仅提供单个网络中从站到站的文件传输,特别是不提供同一网络上或通过网关的站内进程到进程的文件传输。其次,进程集合点是退化的,因为没有通过名称查找进程或通过单个服务器方便地处理多个用户的机制。第三,没有真正的流量控制。如果数据到达接收器而无法将其接收到其缓冲区中,则可以简单地丢弃该数据,并完全保证它最终会被重新传输。接收器无法消除此类浪费的传输流或加速重传。第四,数据以属于未命名文件的整数个 16 位字进行传输,因此 EFTP 要么具有极大的限制性,要么要求其数据字内部有一些嵌套文件传输格式。第五,由于接收者同时也是监听者和服务器,因此失去了功能通用性。
41.7.2.3 EFTP shortcomings The EFTP has been very useful, but its shortcomings are many. First, the protocol provides only for file transfer from station to station in a single network and specifically not from process to process within stations either on the same network or through a gateway. Second, process rendezvous is degenerate in that there are no mechanisms for finding processes by name or for convenient handling of multiple users by a single server. Third, there is no real flow control. If data arrives at a receiver unable to accept it into its buffers, the data can simply be thrown away with complete assurance that it will be retransmitted eventually. There is no way for a receiver to quench the flow of such wasted transmissions or to expedite retransmission. Fourth, data is transmitted in integral numbers of 16-bit words belonging to unnamed files and thus the EFTP is either terribly restrictive or demands some nested file transfer formats internal to its data words. And fifth, functional generality is lost because the receiver is also the listener and server.
我们在运行以太网方面的经验使我们得出结论,我们对分布式控制的重视是正确的。通过将通信系统的共享组件保持在最低限度和无源性,我们实现了非常高的可靠性水平。我们的实验性以太网的安装和维护非常令人满意。广播分组交换提供的站互连的灵活性促进了众多计算机网络和多处理应用的发展。……
Our experience with an operating Ethernet leads us to conclude that our emphasis on distributed control was well placed. By keeping the shared components of the communication system to a minimum and passive, we have achieved a very high level of reliability. Installation and maintenance of our experimental Ethernet have been more than satisfactory. The flexibility of station interconnection provided by broadcast packet switching has encouraged the development of numerous computer networking and multiprocessing applications. …
经计算机协会许可,由 Metcalfe 和 Boggs (1976) 转载。
Reprinted from Metcalfe and Boggs (1976), with permission from the Association for Computing Machinery.
许多重要的论文在刚发表时似乎并不重要。就连阿兰·图灵在发表希尔伯特问题的解决方案几个月后也感到不受重视。1937 年 2 月,他写信给他的母亲,说伟人都忽视了他,他只收到了两次重印请求,其中一次来自一位已经知道证明的剑桥同事(Hodges,1983,第 124 页)。很少有论文像《密码学新方向》那样大胆地宣称他们正在报道“一场革命的边缘”。
Many important papers did not seem important when they were first published. Even Alan Turing felt unappreciated a few months after publishing his resolution of Hilbert’s Entscheidungsproblem. He wrote his mother in February of 1937 that the greats were ignoring him and that he had gotten only two requests for reprints, one from a Cambridge colleague who already knew the proof (Hodges, 1983, p. 124). Even fewer papers announce as boldly as “New Directions in Cryptography” that they are reporting from “the brink of a revolution.”
Whitfield Diffie(生于 1944 年)和 Martin Hellman(生于 1945 年)讨论的问题很容易表述。首先是密钥分发。自古以来,人们就一直在互相发送密码信息。一方借助加密密钥对消息进行编码,另一方收到消息后使用相同的密钥对其进行解密。任何在传输过程中拦截消息的人都无法在没有密钥的情况下破译它。问题在于,从根本上安全地从一方向另一方传输密钥并不比安全地发送未编码的消息本身更容易。密钥可能更短,因此更容易隐藏,或者两方可以提前会面,在分道扬镳之前确定密钥。但共享加密密钥的根本问题的最终影响是使安全加密成为强者和富人的垄断——只有有能力保护密钥的实体才能可靠地交换秘密消息。
The problems discussed by Whitfield Diffie (b. 1944) and Martin Hellman (b. 1945) are easy to state. The first is key distribution. Since time immemorial, people have been sending coded messages to each other. One party encodes the message with the aid of an encryption key, and the other party, having received the message, uses the same key to decrypt it. Anyone who intercepts the message in transit can’t decipher it without the key. The problem is that it is fundamentally no easier to transmit the key securely from one party to the other than it would have been to send securely the unencoded message itself. The key may be shorter and therefore easier to hide, or perhaps the two parties can meet up ahead of time to settle on a key before they go their separate ways. But the net effect of the fundamental problem of sharing encryption keys was to make secure encryption a monopoly of the powerful and the wealthy—only entities with the capacity to protect the keys could reliably exchange secret messages.
第二个问题是数字签名:如何分发不可伪造的数字消息或文档,以便接收者可以确信没有人冒充发送者。
The second problem is digital signatures: how to distribute an unforgeable digital message or document so the recipient can be confident that no one has impersonated the sender.
迪菲和赫尔曼解决了密钥分配问题,提出了一种想法,使远方的各方能够通过不安全的通道就密钥达成一致,而无需传输密钥本身——甚至香农在他对保密理论的统一处理中,似乎也认为如果没有密钥就不可能实现这一点。仔细考虑一下:“密钥必须通过不可拦截的方式从传输点传输到接收点”(Shannon,1949,第 670 页)。数字签名问题被证明是相关的,尽管本文中针对数字签名提供的具体建议在数学上与针对密钥分发提供的建议不同。迪菲和赫尔曼希望对这两个问题有一个统一的解决方案,但这必须等待(见第 45 章)。
Diffie and Hellman address the key distribution problem with an idea for enabling parties at a distance to agree on a key over an insecure channel without transmitting the key itself—something even Shannon, in his unifying treatment of secrecy theory, seemed to dismiss as impossible without considering it carefully: “The key must be transmitted by non-interceptible means from transmitting to receiving points” (Shannon, 1949, p. 670). The digital signature problem turned out to be related, though the particular proposals offered in this paper for digital signatures are mathematically different from those offered for key distribution. Diffie and Hellman hoped for a unitary solution to both problems, but that would have to wait (see chapter 45).
尽管《新方向》令人惊叹,而且学术界很快就认识到它提出了各种重要的挑战,但它与许多早期的工作有关,但当时的作者并不知道所有这些工作。Diffie (1988) 认为 Ralph Merkle 于 1974 年作为伯克利大学的一名本科生提出了基本的公钥想法,但无法让任何人听他的。一些公钥密码学的想法是由英国情报机构 GCHQ 的科学家在 20 世纪 70 年代发现的,这些工作长期以来一直处于机密状态,因此在学术界或工业界并不为人所知。
As stunning as “New Directions” was—and it was quickly recognized in academic circles as raising a variety of important challenges—it was related to a variety of earlier work, not all of it known to the authors at the time. Diffie (1988) credits Ralph Merkle with coming up with the basic public-key idea as an undergraduate at Berkeley in 1974—but being unable to get anyone to listen to him. And some of the public key cryptography ideas had been discovered during the 1970s by scientists at GCHQ, the British intelligence agency, work that long remained classified and therefore unknown in academic or industrial circles.
但细节很重要,迪菲和赫尔曼不仅提供了框架,还提供了细节,包括建议使用模幂作为其加密函数的基础。如果离散对数问题(模幂的逆问题)很容易解决,那么 Diffie-Hellman 协议就可以被破解。因此,人们开始寻找离散对数的快速算法。突然间,密码学研究成为主流,而著名数学家 G. H. Hardy 自豪地宣称永远不会对任何人产生实际用途的数论,却成为数学中最相关的领域。(尚未有人找到一种快速的离散对数算法,但在 20 世纪 80 年代,Adi Shamir [1984] 发现本文中使用背包问题作为单向数字签名算法基础的提议存在致命缺陷。)
But details matter, and Diffie and Hellman furnished not just the framework but the details, including the proposal to use modular exponentiation as the basis for their encryption function. If the discrete logarithm problem, the inverse of modular exponentiation, is easy to solve, then the Diffie–Hellman protocol can be cracked. Hence the search was on for a fast algorithm for discrete logarithms. Suddenly, cryptography research became mainstream, and number theory, which the eminent mathematician G. H. Hardy had proudly proclaimed would never be of practical use to anyone, became the most relevant field of mathematics. (No one has yet found a fast algorithm for discrete logarithms, but in the 1980s Adi Shamir [1984] discovered that the proposal in this paper to use the knapsack problem as the basis for a one-way digital signature algorithm was fatally flawed.)
政府和商业界对公钥密码学重要性的认识比学术界慢得多。事实上,这篇论文的意义不仅在于它的技术重要性,还在于它开辟了一个新的社会政治世界。几个世纪以来,密码学一直是知识界的死水,因为人们普遍认为政府机构对已知的一切保密。迪菲和赫尔曼的论文标志着密码学思想公开交流的开始。
Government and commerce were much slower than academia to acknowledge the importance of public-key cryptography. The significance of this paper, indeed, is as much for opening a new sociopolitical world as for its technical importance. Cryptography had been an intellectual backwater for centuries, because it was widely assumed that whatever was known was being kept under wraps by government agencies. Diffie and Hellman’s paper marked the beginning of the open exchange of ideas about cryptography.
它开始将加密工具交到普通公民手中(一旦互联网用于商业和银行业,这就是一项重要的发展),从而引发了政府通过对加密软件实施出口管制并要求在通信系统和手机上安装后门来控制该技术的尝试。如果没有公钥密码学,我们今天就不会如此激烈地争论普通公民的通信自由与政府保护这些公民免受敌对行为者侵害的需要之间的平衡。
And it started to put encryption tools into the hands of ordinary citizens—an essential development once the internet was used for commerce and banking—and thereby set off government attempts to control the technology by imposing export controls on encryption software and requiring back-doors to communications systems and cell phones. Without public key cryptography, we would not today be debating so intensely the balance between the freedom of communication of ordinary citizens and the need for governments to protect those citizens against hostile actors.
今天,我们正站在密码学革命的边缘。廉价数字硬件的发展使其摆脱了机械计算的设计限制,并将高级加密设备的成本降低到可以用于远程自动提款机和计算机终端等商业应用的水平。反过来,此类应用程序产生了对新型密码系统的需求,该系统最大限度地减少了安全密钥分发通道的必要性并提供了相当于书面签名的功能。与此同时,信息论和计算机科学的理论发展有望提供可证明安全的密码系统,将这一古老的艺术转变为一门科学。
We stand today on the brink of a revolution in cryptography. The development of cheap digital hardware has freed it from the design limitations of mechanical computing and brought the cost of high grade cryptographic devices down to where they can be used in such commercial applications as remote cash dispensers and computer terminals. In turn, such applications create a need for new types of cryptographic systems which minimize the necessity of secure key distribution channels and supply the equivalent of a written signature. At the same time, theoretical developments in information theory and computer science show promise of providing provably secure cryptosystems, changing this ancient art into a science.
计算机控制的通信网络的发展使得世界两端的人或计算机之间能够轻松且廉价地进行联系,用电信取代了大多数邮件和许多短途旅行。对于许多应用程序来说,必须确保这些联系人的安全,防止窃听和非法消息注入。然而,目前安全问题的解决远远落后于通信技术的其他领域。当代密码学无法满足要求,因为它的使用会给系统用户带来严重的不便,从而消除了远程处理的许多好处。
The development of computer controlled communication networks promises effortless and inexpensive contact between people or computers on opposite sides of the world, replacing most mail and many excursions with telecommunications. For many applications these contacts must be made secure against both eavesdropping and the injection of illegitimate messages. At present, however, the solution of security problems lags well behind other areas of communications technology. Contemporary cryptography is unable to meet the requirements, in that its use would impose such severe inconveniences on the system users, as to eliminate many of the benefits of teleprocessing.
最著名的加密问题是隐私问题:防止通过不安全的通道从通信中未经授权地提取信息。然而,为了使用密码术来确保隐私,目前通信双方需要共享一个不为其他人所知的密钥。这是通过提前通过一些安全渠道(例如私人快递或挂号邮件)发送密钥来完成的。然而,两个素不相识的人之间的私人对话在商业中是很常见的,期望最初的业务接触被推迟足够长的时间以便通过某种物理方式传输密钥是不现实的。这一关键分配问题带来的成本和延迟是业务通信传输到大型远程处理网络的主要障碍。
The best known cryptographic problem is that of privacy: preventing the unauthorized extraction of information from communications over an insecure channel. In order to use cryptography to insure privacy, however, it is currently necessary for the communicating parties to share a key which is known to no one else. This is done by sending the key in advance over some secure channel such as private courier or registered mail. A private conversation between two people with no prior acquaintance is a common occurrence in business, however, and it is unrealistic to expect initial business contacts to be postponed long enough for keys to be transmitted by some physical means. The cost and delay imposed by this key distribution problem is a major barrier to the transfer of business communications to large teleprocessing networks.
§ 42.3提出了两种通过公共(即不安全)通道传输密钥信息而不损害系统安全的方法。在公钥密码系统中,加密和解密由不同的密钥E和D控制,使得从E计算D在计算上是不可行的(例如,需要 $10^{100}$ 条指令)。因此可以公开公开加密密钥E而不损害解密密钥D。因此,网络的每个用户都可以将其加密密钥放在公共目录中。这使得系统的任何用户都可以将消息发送给任何其他用户,并以只有预期接收者才能解密的方式进行加密。因此,公钥密码系统是多路访问密码。因此,任何两个人之间都可以进行私人对话,无论他们以前是否曾经进行过交流。每个人都向对方发送用接收者的公共加密密钥加密的消息,并使用自己的秘密解密密钥解密他收到的消息。
§42.3 proposes two approaches to transmitting keying information over public (i.e., insecure) channels without compromising the security of the system. In a public key cryptosystem enciphering and deciphering are governed by distinct keys, E and D, such that computing D from E is computationally infeasible (e.g., requiring $10^{100}$ instructions). The enciphering key E can thus be publicly disclosed without compromising the deciphering key D. Each user of the network can, therefore, place his enciphering key in a public directory. This enables any user of the system to send a message to any other user enciphered in such a way that only the intended receiver is able to decipher it. As such, a public key cryptosystem is a multiple access cipher. A private conversation can therefore be held between any two individuals regardless of whether they have ever communicated before. Each one sends messages to the other enciphered in the receiver’s public enciphering key and deciphers the messages he receives using his own secret deciphering key.
我们提出了一些开发公钥密码系统的技术,但问题在很大程度上仍然悬而未决。
We propose some techniques for developing public key cryptosystems, but the problem is still largely open.
公钥分发系统提供了一种不同的方法来消除对安全密钥分发渠道的需求。在这样的系统中,希望交换密钥的两个用户来回通信,直到他们获得共同的密钥。窃听此交换的第三方一定会发现从偷听到的信息计算密钥在计算上是不可行的。第42.3节给出了公钥分发问题的可能解决方案,Merkle (1978) 有不同形式的部分解决方案。
Public key distribution systems offer a different approach to eliminating the need for a secure key distribution channel. In such a system, two users who wish to exchange a key communicate back and forth until they arrive at a key in common. A third party eavesdropping on this exchange must find it computationally infeasible to compute the key from the information overheard. A possible solution to the public key distribution problem is given in §42.3, and Merkle (1978) has a partial solution of a different form.
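A toy version of such an exchange, in the form now known as Diffie–Hellman key exchange, is sketched below. The details of §42.3 are not reproduced in this excerpt, so the sketch relies only on the editor's remark that the scheme is built on modular exponentiation. The prime 23, the base 5, and the two secret exponents are illustrative values far too small to be secure, and the function names are hypothetical.

```python
def public_value(base: int, secret: int, prime: int) -> int:
    """Value a user sends in the clear: base**secret mod prime."""
    return pow(base, secret, prime)


def shared_key(other_public: int, secret: int, prime: int) -> int:
    """Key a user derives from the other side's public value and its own secret."""
    return pow(other_public, secret, prime)


# Toy parameters (assumptions for illustration only).
PRIME, BASE = 23, 5
alice_secret, bob_secret = 6, 15

alice_public = public_value(BASE, alice_secret, PRIME)  # sent over the open channel
bob_public = public_value(BASE, bob_secret, PRIME)      # sent over the open channel

alice_key = shared_key(bob_public, alice_secret, PRIME)
bob_key = shared_key(alice_public, bob_secret, PRIME)
```

Both users arrive at the same key because (g^a)^b and (g^b)^a are equal modulo the prime. An eavesdropper sees only the two public values and would have to recover a secret exponent from them, which is the discrete logarithm problem mentioned in the introduction.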
适合加密解决方案的第二个问题是身份验证,它阻碍了远程处理系统取代当代商业通信。在目前的业务中,合同的有效性是通过签名来保证的。签署的合同是协议的法律证据,持有人在必要时可以在法庭上出示。然而,签名的使用需要传输和存储书面合同。为了对这种纸质工具进行纯粹的数字替代,每个用户都必须能够生成一条消息,其真实性可以由任何人检查,但其他任何人(甚至接收者)都无法生成。由于只有一个人可以发出消息,但许多人可以接收消息,因此这可以视为一种广播密码。目前的电子认证技术无法满足这一需求。
A second problem, amenable to cryptographic solution, which stands in the way of replacing contemporary business communications by teleprocessing systems is authentication. In current business, the validity of contracts is guaranteed by signatures. A signed contract serves as legal evidence of an agreement which the holder can present in court if necessary. The use of signatures, however, requires the transmission and storage of written contracts. In order to have a purely digital replacement for this paper instrument, each user must be able to produce a message whose authenticity can be checked by anyone, but which could not have been produced by anyone else, even the recipient. Since only one person can originate messages but many people can receive messages, this can be viewed as a broadcast cipher. Current electronic authentication techniques cannot meet this need.
§ 42.4讨论了提供真实的、数字的、消息相关的签名的问题。由于那里提出的原因,我们将此称为单向身份验证问题。给出了一些部分解决方案,并展示了如何将任何公钥密码系统转换为单向认证系统。
§42.4 discusses the problem of providing a true, digital, message dependent signature. For reasons brought out there, we refer to this as the one-way authentication problem. Some partial solutions are given, and it is shown how any public key cryptosystem can be transformed into a one-way authentication system.
§ 42.5将考虑各种密码问题的相互关系,并引入更困难的活板门问题。
§42.5 will consider the interrelation of various cryptographic problems and introduce the even more difficult problem of trap doors.
与此同时,通信和计算引发了新的密码问题,其后代信息论和计算理论已经开始为解决经典密码学中的重要问题提供工具。
At the same time that communications and computation have given rise to new cryptographic problems, their offspring, information theory and the theory of computation, have begun to supply tools for the solution of important problems in classical cryptography.
寻找牢不可破的代码是密码学研究最古老的主题之一,但直到本世纪,所有提出的系统最终都被破解了。然而,在 20 世纪 20 年代,“一次性本”被发明,并且被证明是牢不可破的(Kahn,1967,第 398-400 页)。四分之一个世纪后,信息论为这个系统和相关系统奠定了坚实的基础(Shannon,1949)。一次性一键密码本需要极长的密钥,因此在大多数应用中价格昂贵得令人望而却步。
The search for unbreakable codes is one of the oldest themes of cryptographic research, but until this century all proposed systems have ultimately been broken. In the nineteen twenties, however, the “one time pad” was invented, and shown to be unbreakable (Kahn, 1967, pp. 398–400). The theoretical basis underlying this and related systems was put on a firm foundation a quarter century later by information theory (Shannon, 1949). One time pads require extremely long keys and are therefore prohibitively expensive in most applications.
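The key-length cost of a one time pad is easy to see in a sketch. The version below combines message and pad with bitwise exclusive-or, which is an assumption about the combining operation rather than something stated in the text; the function names are hypothetical.

```python
import secrets


def otp_encrypt(message: bytes, pad: bytes) -> bytes:
    """Combine each message byte with the corresponding pad byte."""
    if len(pad) != len(message):
        raise ValueError("the pad must be exactly as long as the message")
    return bytes(m ^ k for m, k in zip(message, pad))


# Exclusive-or is its own inverse, so decryption is the same operation.
otp_decrypt = otp_encrypt


def fresh_pad(length: int) -> bytes:
    """Generate a random pad; it must never be reused for another message."""
    return secrets.token_bytes(length)
```

Every message consumes as many secret key bytes as it contains, and those bytes must reach the receiver over a secure channel beforehand, which is why the text calls the scheme prohibitively expensive for most applications.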
相比之下,大多数密码系统的安全性在于密码分析者在不知道密钥的情况下发现明文的计算难度。这个问题属于计算复杂性和算法分析领域,这是研究解决计算问题难度的两个最新学科。利用这些理论的结果,在可预见的将来,有可能将安全性证明扩展到更有用的系统类别。§ 42.6探讨了这种可能性。
In contrast, the security of most cryptographic systems resides in the computational difficulty to the cryptanalyst of discovering the plaintext without knowledge of the key. This problem falls within the domains of computational complexity and analysis of algorithms, two recent disciplines which study the difficulty of solving computational problems. Using the results of these theories, it may be possible to extend proofs of security to more useful classes of systems in the foreseeable future. §42.6 explores this possibility.
在进行新的开发之前,我们将在下一节中介绍术语并定义威胁环境。
Before proceeding to newer developments, we introduce terminology and define threat environments in the next section.
密码学是对“数学”系统的研究,用于解决两种安全问题:隐私和身份验证。隐私系统可防止未经授权的各方从通过公共渠道传输的消息中提取信息,从而确保消息的发送者只能由预期的接收者读取该消息。身份验证系统可防止未经授权的消息注入公共通道,从而确保消息接收方其发送方的合法性。
Cryptography is the study of “mathematical” systems for solving two kinds of security problems: privacy and authentication. A privacy system prevents the extraction of information by unauthorized parties from messages transmitted over a public channel, thus assuring the sender of a message that it is being read only by the intended recipient. An authentication system prevents the unauthorized injection of messages into a public channel, assuring the receiver of a message of the legitimacy of its sender.
如果某个频道的安全性不足以满足用户的需求,则该频道被视为公共频道。因此,诸如电话线之类的信道可能被一些用户认为是私有的,而被其他用户认为是公共的。任何通道都可能受到窃听或注入或两者兼而有之的威胁,具体取决于其用途。在电话通信中,注入威胁至关重要,因为被叫方无法确定正在呼叫的电话。窃听需要使用窃听器,这在技术上更加困难,而且在法律上也更危险。相比之下,广播领域的情况则相反。窃听是被动的,不涉及法律风险,而注入则会使非法发射器被发现和起诉。
A channel is considered public if its security is inadequate for the needs of its users. A channel such as a telephone line may therefore be considered private by some users and public by others. Any channel may be threatened with eavesdropping or injection or both, depending on its use. In telephone communication, the threat of injection is paramount, since the called party cannot determine which phone is calling. Eavesdropping, which requires the use of a wiretap, is technically more difficult and legally hazardous. In radio, by comparison, the situation is reversed. Eavesdropping is passive and involves no legal hazard, while injection exposes the illegitimate transmitter to discovery and prosecution.
将我们的问题分为隐私和身份验证问题后,我们有时会将身份验证进一步细分为消息身份验证(这是上面定义的问题)和用户身份验证(其中系统的唯一任务是验证个人是否是他所声称的人)是。例如,必须验证出示信用卡的个人的身份,但没有他希望传输的消息。尽管用户认证中明显缺少消息,但这两个问题在很大程度上是相同的。在用户认证中,有一条隐式消息“I AM USER X”,而消息认证只是验证发送消息的一方的身份。然而,这两个子问题的威胁环境和其他方面的差异有时可以方便地区分它们。
Having divided our problems into those of privacy and authentication we will sometimes further subdivide authentication into message authentication, which is the problem defined above, and user authentication, in which the only task of the system is to verify that an individual is who he claims to be. For example, the identity of an individual who presents a credit card must be verified, but there is no message which he wishes to transmit. In spite of this apparent absence of a message in user authentication, the two problems are largely equivalent. In user authentication, there is an implicit message “I AM USER X,” while message authentication is just verification of the identity of the party sending the message. Differences in the threat environments and other aspects of these two subproblems, however, sometimes make it convenient to distinguish between them.
图 42.1说明了用于通信隐私的传统密码系统中的信息流。共有三方:发送者、接收者和窃听者。发送者生成明文或未加密的消息P,通过不安全的通道传送给合法接收者。为了防止窃听者得知P,发送者用可逆变换 $S_K$ 对P进行操作,以产生密文或密码 $C = S_K(P)$。密钥K仅通过安全通道传输到合法接收者,如图 42.1中的屏蔽路径所示。由于合法接收者知道K,因此他可以用 $S_K^{-1}$ 进行操作来解密C,得到 $S_K^{-1}(C) = S_K^{-1}(S_K(P)) = P$,即原始明文消息。由于容量或延迟的原因,安全信道不能用于传输P本身。例如,安全通道可能是每周一次的快递,不安全通道可能是电话线。
Figure 42.1 illustrates the flow of information in a conventional cryptographic system used for privacy of communications. There are three parties: a transmitter, a receiver, and an eavesdropper. The transmitter generates a plaintext or unenciphered message P to be communicated over an insecure channel to the legitimate receiver. In order to prevent the eavesdropper from learning P, the transmitter operates on P with an invertible transformation $S_K$ to produce the ciphertext or cryptogram $C = S_K(P)$. The key K is transmitted only to the legitimate receiver via a secure channel, indicated by a shielded path in Figure 42.1. Since the legitimate receiver knows K, he can decipher C by operating with $S_K^{-1}$ to obtain $S_K^{-1}(C) = S_K^{-1}(S_K(P)) = P$, the original plaintext message. The secure channel cannot be used to transmit P itself for reasons of capacity or delay. For example, the secure channel might be a weekly courier and the insecure channel a telephone line.
图 42.1: 传统密码系统中的信息流
Figure 42.1: Flow of information in conventional cryptographic system
密码系统是从明文消息空间 { P }到密文消息空间 { C }的可逆变换S K : { P }→ { C } 的单参数族 { S K } K ∈{ K }。参数K称为密钥,是从称为密钥空间的有限集 { K } 中选择的。如果消息空间 { P } 和 { C } 相等,我们将用 { M } 来表示它们。讨论时由于单个密码变换S K,我们有时会省略系统的提及,而仅提及变换K。
A cryptographic system is a single parameter family $\{S_K\}_{K \in \{K\}}$ of invertible transformations $S_K: \{P\} \to \{C\}$ from a space {P} of plaintext messages to a space {C} of ciphertext messages. The parameter K is called the key and is selected from a finite set {K} called the keyspace. If the message spaces {P} and {C} are equal, we will denote them both by {M}. When discussing individual cryptographic transformations $S_K$, we will sometimes omit mention of the system and merely refer to the transformation K.
设计密码系统{ SK }的目标是使加密和解密操作成本低廉,但要确保任何成功的密码分析操作都过于复杂而不经济。解决这个问题有两种方法。由于密码分析的计算成本而安全,但会屈服于无限计算的攻击的系统被称为计算安全;而无论允许多少计算,都能抵抗任何密码分析攻击的系统被称为无条件安全。Shannon (1949) 和 Hellman (1977) 讨论了无条件安全系统,它属于信息论的一部分,称为香农理论,它涉及通过无限计算获得的最佳性能。
The goal in designing the cryptosystem $\{S_K\}$ is to make the enciphering and deciphering operations inexpensive, but to ensure that any successful cryptanalytic operation is too complex to be economical. There are two approaches to this problem. A system which is secure due to the computational cost of cryptanalysis, but which would succumb to an attack with unlimited computation, is called computationally secure; while a system which can resist any cryptanalytic attack, no matter how much computation is allowed, is called unconditionally secure. Unconditionally secure systems are discussed in Shannon (1949) and Hellman (1977) and belong to that portion of information theory, called the Shannon theory, which is concerned with optimal performance obtainable with unlimited computation.
无条件的安全性源于密码存在多种有意义的解决方案。例如,由英文文本产生的简单替换密码 XMD 可以表示明文消息:now、and、the 等。相比之下,计算安全密码包含足够的信息来唯一确定明文和密钥。它的安全性仅取决于计算它们的成本。
Unconditional security results from the existence of multiple meaningful solutions to a cryptogram. For example, the simple substitution cryptogram XMD resulting from English text can represent the plaintext messages: now, and, the, etc. A computationally secure cryptogram, in contrast, contains sufficient information to uniquely determine the plaintext and the key. Its security resides solely in the cost of computing them.
唯一常用的无条件安全系统是一次性密码本,其中明文与随机选择的相同长度的密钥相结合。虽然这样的系统已被证明是安全的,但所需的大量密钥使其对于大多数应用程序来说不切实际。除非另有说明,本文讨论的是计算安全系统,因为这些系统更普遍适用。当我们谈论开发可证明安全的密码系统的必要性时,我们排除了那些难以使用的系统,例如一次性一垫本。相反,我们想到的系统仅使用几百位密钥,并且可以在少量数字硬件或几百行软件中实现。
The only unconditionally secure system in common use is the one time pad, in which the plaintext is combined with a randomly chosen key of the same length. While such a system is provably secure, the large amount of key required makes it impractical for most applications. Except as otherwise noted, this paper deals with computationally secure systems since these are more generally applicable. When we talk about the need to develop provably secure cryptosystems we exclude those, such as the one time pad, which are unwieldy to use. Rather, we have in mind systems using only a few hundred bits of key and implementable in either a small amount of digital hardware or a few hundred lines of software.
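As an illustration of why the one time pad needs as much key as plaintext, and why its cryptograms admit many meaningful solutions, here is a minimal sketch. It assumes the combining operation is bytewise XOR (addition modulo 2), which is one common choice; the function names `otp_encrypt`, `otp_decrypt`, and `xor_bytes` are hypothetical and not taken from the paper.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Combine two equal-length byte strings bit by bit, modulo 2."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def otp_encrypt(plaintext: bytes):
    """Return (key, ciphertext). The key is random and as long as the plaintext."""
    key = secrets.token_bytes(len(plaintext))
    return key, xor_bytes(plaintext, key)


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Recover the plaintext; XOR with the same key undoes the encryption."""
    return xor_bytes(ciphertext, key)


if __name__ == "__main__":
    key, cryptogram = otp_encrypt(b"erase file 7")
    print(otp_decrypt(key, cryptogram))
    # Any other message of the same length is also a "solution" under some key,
    # so the cryptogram alone does not single out the plaintext.
    other_key = xor_bytes(cryptogram, b"erase file 8")
    print(otp_decrypt(other_key, cryptogram))
```

The second decryption shows the point made above about unconditional security: without the key, every plaintext of the right length remains possible.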
如果一个任务的成本(以使用的内存量或运行时间来衡量)是有限的但又大得不可能,我们将称该任务在计算上不可行。
We will call a task computationally infeasible if its cost as measured by either the amount of memory used or the runtime is finite but impossibly large.
就像纠错码分为卷积码和分组码一样,密码系统也可以分为两大类:流密码和分组密码。流密码以小块(位或字符)的形式处理明文,通常会生成伪随机位序列,该序列以模 2 与明文的位相加。分组密码以纯粹组合的方式作用于大文本块,这样输入块中的微小变化就会产生结果输出的重大变化。本文主要讨论分组密码,因为这种错误传播特性在许多身份验证应用中都很有价值。
Much as error correcting codes are divided into convolutional and block codes, cryptographic systems can be divided into two broad classes: stream ciphers and block ciphers. Stream ciphers process the plaintext in small chunks (bits or characters), usually producing a pseudorandom sequence of bits which is added modulo 2 to the bits of the plaintext. Block ciphers act in a purely combinatorial fashion on large blocks of text, in such a way that a small change in the input block produces a major change in the resulting output. This paper deals primarily with block ciphers, because this error propagation property is valuable in many authentication applications.
在身份验证系统中,密码学用于保证接收者收到的消息的真实性。不仅必须防止干预者将全新的、看起来真实的消息注入通道中,而且还必须防止他通过组合或仅仅重复他过去复制的旧消息来创建明显真实的消息。一般来说,旨在保证隐私的加密系统不会阻止后一种形式的恶作剧。
In an authentication system, cryptography is used to guarantee the authenticity of the message to the receiver. Not only must a meddler be prevented from injecting totally new, authentic looking messages into a channel, but he must be prevented from creating apparently authentic messages by combining, or merely repeating, old messages which he has copied in the past. A cryptographic system intended to guarantee privacy will not, in general, prevent this latter form of mischief.
为了保证消息的真实性,添加了信息,该信息不仅是消息和密钥的函数,而且也是日期和时间的函数;例如,将日期和时间附加到每条消息并对整个序列进行加密。这确保了只有拥有密钥的人才能生成一条消息,该消息在解密后将包含正确的日期和时间。然而,必须小心使用密文的微小变化会导致解密的明文发生较大变化的系统。这种故意的错误传播确保,如果在通道上故意注入噪声,将诸如“擦除文件 7”之类的消息更改为诸如“擦除文件 8”之类的不同消息,它也会破坏身份验证信息。然后该消息将因不真实而被拒绝。
To guarantee the authenticity of a message, information is added which is a function not only of the message and a secret key, but of the date and time as well; for example, by attaching the date and time to each message and encrypting the entire sequence. This assures that only someone who possesses the key can generate a message which, when decrypted, will contain the proper date and time. Care must be taken, however, to use a system in which small changes in the ciphertext result in large changes in the deciphered plaintext. This intentional error propagation ensures that if the deliberate injection of noise on the channel changes a message such as “erase file 7” into a different message such as “erase file 8,” it will also corrupt the authentication information. The message will then be rejected as inauthentic.
评估密码系统充分性的第一步是对其面临的威胁进行分类。用于隐私或认证的密码系统可能会出现以下威胁。
The first step in assessing the adequacy of cryptographic systems is to classify the threats to which they are to be subjected. The following threats may occur to cryptographic systems employed for either privacy or authentication.
仅密文攻击是一种密码分析攻击,其中密码分析者仅拥有密文。
A ciphertext only attack is a cryptanalytic attack in which the cryptanalyst possesses only ciphertext.
已知的明文攻击是密码分析攻击,其中密码分析者拥有大量相应的明文和密文。
A known plaintext attack is a cryptanalytic attack in which the cryptanalyst possesses a substantial quantity of corresponding plaintext and ciphertext.
选择明文攻击是一种密码分析攻击,其中密码分析者可以提交无限数量的他自己选择的明文消息并检查生成的密码。
A chosen plaintext attack is a cryptanalytic attack in which the cryptanalyst can submit an unlimited number of plaintext messages of his own choosing and examine the resulting cryptograms.
在所有情况下,都假设对手知道所使用的通用系统{ SK },因为该信息可以通过研究密码设备来获得。[编辑:这个假设被称为 Kerckhoffs 原理,可以追溯到 1883 年,尽管它常常被不明智地忽视。] 虽然许多密码学用户试图对其设备保密,但许多商业应用不仅要求通用系统公开,而且要求公开系统。让它成为标准。
In all cases it is assumed that the opponent knows the general system $\{S_K\}$ in use since this information can be obtained by studying a cryptographic device. [EDITOR: This assumption is known as Kerckhoffs’s principle and dates to 1883, though it has often unwisely been ignored.] While many users of cryptography attempt to keep their equipment secret, many commercial applications require not only that the general system be public but that it be standard.
纯密文攻击在实践中经常发生。密码分析者仅使用所用语言的统计特性的知识(例如,在英语中,字母 e 出现的时间为 13%)和某些“可能”单词的知识(例如,一个字母可能以“亲爱的先生:”开头) )。它是系统可能遭受的最弱的威胁,任何屈服于它的系统都被认为是完全不安全的。
A ciphertext only attack occurs frequently in practice. The cryptanalyst uses only knowledge of the statistical properties of the language in use (e.g., in English, the letter e occurs 13 percent of the time) and knowledge of certain “probable” words (e.g., a letter probably begins “Dear Sir:”). It is the weakest threat to which a system can be subjected, and any system which succumbs to it is considered totally insecure.
能够抵御已知明文攻击的系统使用户无需保密其过去的消息,也无需在解密之前对其进行解释。这对系统用户来说是不合理的负担,特别是在产品公告或新闻稿可能以加密形式发送以供以后公开披露的商业情况下。外交信函中的类似情况已导致许多本应安全的系统被破解。虽然已知的明文攻击并不总是可能发生,但它的发生频率足够高,以至于无法抵抗它的系统不被认为是安全的。
A system which is secure against a known plaintext attack frees its users from the need to keep their past messages secret, or to paraphrase them prior to declassification. This is an unreasonable burden to place on the system’s users, particularly in commercial situations where product announcements or press releases may be sent in encrypted form for later public disclosure. Similar situations in diplomatic correspondence have led to the cracking of many supposedly secure systems. While a known plaintext attack is not always possible, its occurrence is frequent enough that a system which cannot resist it is not considered secure.
选择明文攻击在实践中很难实现,但可以近似。例如,向竞争对手提交提案可能会导致他对其进行加密以便传输到他的总部。因此,能够抵御选定明文攻击的密码可以使用户不必担心对手是否可以在他们的系统中植入消息。
A chosen plaintext attack is difficult to achieve in practice, but can be approximated. For example, submitting a proposal to a competitor may result in his enciphering it for transmission to his headquarters. A cipher which is secure against a chosen plaintext attack thus frees its users from concern over whether their opponents can plant messages in their system.
为了证明系统安全,考虑更强大的密码分析威胁是适当的,因为这些威胁不仅提供了密码系统工作环境的更真实的模型,而且使系统强度的评估变得更容易。许多难以使用仅密文攻击进行分析的系统可以在已知明文或选定明文攻击下立即排除。
For the purpose of certifying systems as secure, it is appropriate to consider the more formidable cryptanalytic threats as these not only give more realistic models of the working environment of a cryptographic system, but make the assessment of the system’s strength easier. Many systems which are difficult to analyze using a ciphertext only attack can be ruled out immediately under known plaintext or chosen plaintext attacks.
从这些定义可以清楚地看出,密码分析是一个系统识别问题。已知明文和选择明文攻击分别对应于被动和主动系统识别问题。与许多考虑系统识别的学科(例如自动故障诊断)不同,密码学的目标是构建难以识别的系统,而不是容易识别的系统。
As is clear from these definitions, cryptanalysis is a system identification problem. The known plaintext and chosen plaintext attacks correspond to passive and active system identification problems, respectively. Unlike many subjects in which system identification is considered, such as automatic fault diagnosis, the goal in cryptography is to build systems which are difficult, rather than easy, to identify.
选择的明文攻击通常称为 IFF 攻击,该术语源自二战后加密“识别朋友或敌人”系统的发展。敌我识别系统使军用雷达能够自动区分友机和敌机。雷达向飞机发送时变质询,飞机接收质询,用适当的密钥对其进行加密,然后将其发送回雷达。通过将此响应与正确加密的挑战版本进行比较,雷达可以识别友方飞机。当飞机飞越敌方领土时,敌方密码分析人员可以发送质询并检查加密响应,试图确定正在使用的身份验证密钥,从而对系统发起选定的明文攻击。在实践中,这种威胁是通过限制挑战的形式来应对的,挑战的形式不必是不可预测的,而只是不重复的。
The chosen plaintext attack is often called an IFF attack, terminology which descends from its origin in the development of cryptographic “identification friend or foe” systems after World War II. An IFF system enables military radars to distinguish between friendly and enemy planes automatically. The radar sends a time-varying challenge to the airplane which receives the challenge, encrypts it under the appropriate key, and sends it back to the radar. By comparing this response with a correctly encrypted version of the challenge, the radar can recognize a friendly aircraft. While the aircraft are over enemy territory, enemy cryptanalysts can send challenges and examine the encrypted responses in an attempt to determine the authentication key in use, thus mounting a chosen plaintext attack on the system. In practice, this threat is countered by restricting the form of the challenges, which need not be unpredictable, but only nonrepeating.
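The challenge and response exchange described above can be sketched as follows. This is only an illustration under stated assumptions: a keyed hash (HMAC-SHA256 from the Python standard library) stands in for "encrypting the challenge under the appropriate key", and a simple counter supplies challenges that are nonrepeating but predictable. The class and function names (`Radar`, `respond`) are hypothetical.

```python
import hashlib
import hmac
import itertools


def respond(key: bytes, challenge: bytes) -> bytes:
    """Aircraft side: transform the challenge under the shared key.

    Assumption: HMAC-SHA256 is used here in place of a block cipher.
    """
    return hmac.new(key, challenge, hashlib.sha256).digest()


class Radar:
    """Radar side: issues nonrepeating challenges and checks responses."""

    def __init__(self, key: bytes):
        self._key = key
        self._counter = itertools.count(1)

    def next_challenge(self) -> bytes:
        # A counter never repeats, which is all the text requires.
        return next(self._counter).to_bytes(8, "big")

    def is_friend(self, challenge: bytes, response: bytes) -> bool:
        expected = respond(self._key, challenge)
        return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    shared_key = b"k" * 16
    radar = Radar(shared_key)
    first = radar.next_challenge()
    print(radar.is_friend(first, respond(shared_key, first)))
    second = radar.next_challenge()
    # Replaying the old response against a fresh challenge fails.
    print(radar.is_friend(second, respond(shared_key, first)))
```

Because each challenge is used once, a recorded response is of no use to a meddler who replays it later.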
身份验证系统还存在其他威胁,这些威胁无法通过传统密码学来处理,需要借助本文中介绍的新思想和技术。接收者身份验证数据的泄露威胁是由多用户网络中的情况引起的,其中接收者通常是系统本身。接收者的密码表和其他认证数据比发送者(个人用户)的密码表和其他认证数据更容易被盗。如稍后所示,一些用于防范这种威胁的技术也可以防范争议的威胁。也就是说,消息可能已发送但随后被发送器或接收器拒绝。或者,任何一方都可能声称已发送消息,但实际上没有发送消息。需要不可伪造的数字签名和收据。例如,不诚实的股票经纪人可能会试图通过伪造客户订单来掩盖未经授权的买卖以谋取个人利益,或者客户可能会否认其实际授权的订单,但他后来发现这会造成损失。我们将引入一些概念,允许接收者验证消息的真实性,但阻止他生成明显真实的消息,从而防止接收者的身份验证数据泄露的威胁和争议的威胁。
There are other threats to authentication systems which cannot be treated by conventional cryptography, and which require recourse to the new ideas and techniques introduced in this paper. The threat of compromise of the receiver’s authentication data is motivated by the situation in multiuser networks where the receiver is often the system itself. The receiver’s password tables and other authentication data are then more vulnerable to theft than those of the transmitter (an individual user). As shown later, some techniques for protecting against this threat also protect against the threat of dispute. That is, a message may be sent but later repudiated by either the transmitter or the receiver. Or, it may be alleged by either party that a message was sent when in fact none was. Unforgeable digital signatures and receipts are needed. For example, a dishonest stockbroker might try to cover up unauthorized buying and selling for personal gain by forging orders from clients, or a client might disclaim an order actually authorized by him but which he later sees will cause a loss. We will introduce concepts which allow the receiver to verify the authenticity of a message, but prevent him from generating apparently authentic messages, thereby protecting against both the threat of compromise of the receiver’s authentication data and the threat of dispute.
如图42.1所示,密码学已经成为一种衍生的安全措施。一旦存在可以传输密钥的安全通道,就可以通过加密在其上发送的消息来将安全性扩展到更高带宽或更小延迟的其他通道。其效果是将密码学的使用限制在预先为密码安全性做好准备的人们之间的通信中。
As shown in Figure 42.1, cryptography has been a derivative security measure. Once a secure channel exists along which keys can be transmitted, the security can be extended to other channels of higher bandwidth or smaller delay by encrypting the messages sent on them. The effect has been to limit the use of cryptography to communications among people who have made prior preparation for cryptographic security.
为了开发大型、安全的电信系统,必须改变这一点。大量的用户n会产生更大的数量,即 ( n 2 − n )/2 可能希望与所有其他人进行私下通信的潜在对。假设一对之前不认识的用户能够等待通过某种安全物理方式发送密钥,或者所有 ( n 2 − n )/2 对的密钥可以排列在进步。在另一篇论文(Diffie 和 Hellman,1976b)中,作者考虑了一种保守的方法,不需要密码学本身的新发展,但这涉及到安全性降低、不便以及网络对初始连接协议的星型配置的限制。
In order to develop large, secure, telecommunications systems, this must be changed. A large number of users n results in an even larger number, $(n^2 - n)/2$ potential pairs who may wish to communicate privately from all others. It is unrealistic to assume either that a pair of users with no prior acquaintance will be able to wait for a key to be sent by some secure physical means, or that keys for all $(n^2 - n)/2$ pairs can be arranged in advance. In another paper (Diffie and Hellman, 1976b), the authors have considered a conservative approach requiring no new development in cryptography itself, but this involves diminished security, inconvenience, and restriction of the network to a starlike configuration with respect to initial connection protocol.
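To give a feel for how quickly the number of pairwise keys grows, the short sketch below evaluates the pair count for a few network sizes. The function name `key_pairs` and the sample sizes are illustrative choices, not figures from the paper.

```python
def key_pairs(n: int) -> int:
    """Number of distinct user pairs among n users: (n^2 - n) / 2."""
    return (n * n - n) // 2


if __name__ == "__main__":
    for users in (10, 1_000, 1_000_000):
        print(users, key_pairs(users))
```

A thousand users already imply roughly half a million pairwise keys, which is why arranging all of them in advance is unrealistic.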
我们建议可以开发如图 42.2所示类型的系统,其中两方仅通过公共通道进行通信并且仅使用公开已知的技术可以创建安全连接。我们研究了解决这个问题的两种方法,分别称为公钥密码系统和公钥分发系统。第一个更强大,有助于解决下一节中处理的身份验证问题,而第二个则更接近实现。
We propose that it is possible to develop systems of the type shown in Figure 42.2, in which two parties communicating solely over a public channel and using only publicly known techniques can create a secure connection. We examine two approaches to this problem, called public key cryptosystems and public key distribution systems, respectively. The first are more powerful, lending themselves to the solution of the authentication problems treated in the next section, while the second are much closer to realization.
图42.2: 公钥系统中的信息流
Figure 42.2: Flow of information in public key system
公钥密码系统是一对表示可逆变换的算法族 { E K } K ∈{ K }和 { D K } K ∈{ K } ,
A public key cryptosystem is a pair of families $\{E_K\}_{K \in \{K\}}$ and $\{D_K\}_{K \in \{K\}}$ of algorithms representing invertible transformations,
$$E_K: \{M\} \to \{M\}$$
$$D_K: \{M\} \to \{M\}$$
在有限消息空间 { M } 上,使得
on a finite message space {M}, such that
1. 对于每个K ∈{ K },E K是D K的倒数,
1. for every K ∈ {K}, $E_K$ is the inverse of $D_K$,
2. 对于每个K ∈{ K } 和M ∈{ M },算法E K和D K很容易计算,
2. for every K ∈ {K} and M ∈ {M}, the algorithms $E_K$ and $D_K$ are easy to compute,
3. 对于几乎每个K ∈{ K },每个与D K等价的容易计算的算法在计算上都不可行从E K导出,
3. for almost every K ∈ {K}, each easily computed algorithm equivalent to $D_K$ is computationally infeasible to derive from $E_K$,
4. 对于每个K ∈ { K },从K计算逆对E K和D K是可行的。
4. for every K ∈ {K}, it is feasible to compute inverse pairs $E_K$ and $D_K$ from K.
由于第三个特性,用户的加密密钥E K可以被公开,而不会损害其秘密解密密钥D K的安全性。密码系统因此被分裂分为两部分,一个加密变换家族和一个解密变换家族,以这样的方式,给定一个家族的成员,不可能找到另一个家族的相应成员。
Because of the third property, a user’s enciphering key $E_K$ can be made public without compromising the security of his secret deciphering key $D_K$. The cryptographic system is therefore split into two parts, a family of enciphering transformations and a family of deciphering transformations in such a way that, given a member of one family, it is infeasible to find the corresponding member of the other.
第四个属性保证,当对加密或解密变换没有限制时,有一种可行的方法来计算相应的逆变换对。实际上,加密设备必须包含用于生成K的真随机数生成器(例如,噪声二极管) ,以及用于从其输出生成E K – D K对的算法。
The fourth property guarantees that there is a feasible way of computing corresponding pairs of inverse transformations when no constraint is placed on what either the enciphering or deciphering transformation is to be. In practice, the cryptoequipment must contain a true random number generator (e.g., a noisy diode) for generating K, together with an algorithm for generating the $E_K$–$D_K$ pair from its outputs.
有了这种系统,密钥分发问题就大大简化了。每个用户在他的终端生成一对逆变换E和D。解密变换D必须保密,但不需要在任何渠道上传达。可以通过将加密密钥E与用户的姓名和地址一起放置在公共目录中来公开它。然后任何人都可以加密消息并将其发送给用户,但没有其他人可以破译发给他的消息。因此,公钥密码系统可以被视为多路访问密码。
Given a system of this kind, the problem of key distribution is vastly simplified. Each user generates a pair of inverse transformations, E and D, at his terminal. The deciphering transformation D must be kept secret, but need never be communicated on any channel. The enciphering key E can be made public by placing it in a public directory along with the user’s name and address. Anyone can then encrypt messages and send them to the user, but no one else can decipher messages intended for him. Public key cryptosystems can thus be regarded as multiple access ciphers.
保护加密密钥的公共文件免遭未经授权的修改至关重要。文件的公共性质使这项任务变得更加容易。读保护是不必要的,并且由于文件很少被修改,因此可以经济地采用复杂的写保护机制。
It is crucial that the public file of enciphering keys be protected from unauthorized modification. This task is made easier by the public nature of the file. Read protection is unnecessary and, since the file is modified infrequently, elaborate write protection mechanisms can be economically employed.
一个有启发性但不幸无用的公钥密码系统示例是通过将表示为二进制n向量m的明文乘以可逆二进制n × n矩阵E来对其进行加密。因此密码等于E m。令D = E −1我们有m = D c。因此,加密和解密都需要大约n 2 次操作。然而,从E计算D涉及矩阵求逆,这是一个更难的问题。获得任意一对逆矩阵至少在概念上比求逆给定矩阵更简单。从单位矩阵I开始,进行初等行和列运算以获得任意可逆矩阵E。然后从I开始,以相反的顺序对这些相同的初等运算进行逆运算,以获得D = E −1。基本操作的序列可以很容易地从随机位串中确定。
A suggestive, although unfortunately useless, example of a public key cryptosystem is to encipher the plaintext, represented as a binary n-vector m, by multiplying it by an invertible binary n × n matrix E. The cryptogram thus equals $c = Em$. Letting $D = E^{-1}$ we have $m = Dc$. Thus, both enciphering and deciphering require about $n^2$ operations. Calculation of D from E, however, involves a matrix inversion which is a harder problem. And it is at least conceptually simpler to obtain an arbitrary pair of inverse matrices than it is to invert a given matrix. Start with the identity matrix I and do elementary row and column operations to obtain an arbitrary invertible matrix E. Then starting with I do the inverses of these same elementary operations in reverse order to obtain $D = E^{-1}$. The sequence of elementary operations could be easily determined from a random bit string.
不幸的是,矩阵求逆只需要大约n 3运算。因此,“密码分析”时间(即,从E计算D)与加密或解密时间的比率最多为n,并且需要巨大的块大小才能获得 10 6或更大的比率。此外,用于从I获取E的基本运算知识似乎并没有大大减少计算D的时间。并且,由于二进制算术中不存在舍入误差,因此数值稳定性在矩阵求逆中并不重要。尽管缺乏实用性,这个矩阵示例对于阐明公钥密码系统中必要的关系仍然有用。
Unfortunately, matrix inversion takes only about $n^3$ operations. The ratio of “cryptanalytic” time (i.e., computing D from E) to enciphering or deciphering time is thus at most n, and enormous block sizes would be required to obtain ratios of $10^6$ or greater. Also, it does not appear that knowledge of the elementary operations used to obtain E from I greatly reduces the time for computing D. And, since there is no round-off error in binary arithmetic, numerical stability is unimportant in the matrix inversion. In spite of its lack of practical utility, this matrix example is still useful for clarifying the relationships necessary in a public key cryptosystem.
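The matrix construction can be sketched in a few lines. The version below is a simplification made for illustration: it uses only elementary row operations (row swaps and row additions over GF(2)), each of which is its own inverse, and it draws the operations from a seeded pseudorandom generator in place of a true random bit string. All function names are hypothetical.

```python
import random


def identity(n):
    """n x n identity matrix over GF(2), stored as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def apply_op(mat, op):
    """Apply one elementary row operation in place.

    ("swap", i, j) exchanges rows i and j; ("add", i, j) adds row j to row i
    modulo 2. Over GF(2) both operations are their own inverses.
    """
    kind, i, j = op
    if kind == "swap":
        mat[i], mat[j] = mat[j], mat[i]
    else:
        mat[i] = [a ^ b for a, b in zip(mat[i], mat[j])]


def make_pair(n, count, rng):
    """Build an inverse pair (E, D) from `count` random elementary operations."""
    ops = []
    for _ in range(count):
        i, j = rng.sample(range(n), 2)  # two distinct row indices
        ops.append((rng.choice(["swap", "add"]), i, j))
    E = identity(n)
    for op in ops:
        apply_op(E, op)
    D = identity(n)
    for op in reversed(ops):  # inverses in reverse order
        apply_op(D, op)
    return E, D


def mat_vec(mat, vec):
    """Matrix-vector product over GF(2); about n^2 bit operations."""
    return [sum(a & b for a, b in zip(row, vec)) % 2 for row in mat]


if __name__ == "__main__":
    E, D = make_pair(8, 50, random.Random(0))
    m = [1, 0, 1, 1, 0, 0, 1, 0]
    c = mat_vec(E, m)       # encipher
    print(mat_vec(D, c) == m)  # decipher recovers m
```

Generating the pair never inverts a matrix, while an opponent holding only E must do so; as the text notes, that gap is too small to be useful.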
一种更实用的方法来寻找一对易于计算的逆算法E和D;这样D很难从E推断出来,利用了低级语言分析程序的困难。任何试图确定某人完成了什么操作的人else 的机器语言程序知道E本身(即E 的作用)很难从E的算法中推断出来。如果通过添加不需要的变量和语句来故意使程序变得混乱,那么确定逆算法可能会变得非常困难。当然,E必须足够复杂,以防止从输入输出对中识别出它。
A more practical approach to finding a pair of easily computed inverse algorithms E and D, such that D is hard to infer from E, makes use of the difficulty of analyzing programs in low level languages. Anyone who has tried to determine what operation is accomplished by someone else’s machine language program knows that E itself (i.e., what E does) can be hard to infer from an algorithm for E. If the program were to be made purposefully confusing through addition of unneeded variables and statements, then determining an inverse algorithm could be made very difficult. Of course, E must be complicated enough to prevent its identification from input-output pairs.
本质上需要的是一种单向编译器:一种将用高级语言编写的易于理解的程序转换为某种机器语言的难以理解的程序的编译器。编译器是单向的,因为编译必须是可行的,但逆向过程是不可行的。由于程序大小和运行时间的效率在此应用中并不重要,因此如果可以优化机器语言的结构以帮助消除混乱,则这种编译器可能是可能的。
Essentially what is required is a one-way compiler: one which takes an easily understood program written in a high level language and translates it into an incomprehensible program in some machine language. The compiler is one-way because it must be feasible to do the compilation, but infeasible to reverse the process. Since efficiency in size of program and run time are not crucial in this application, such compilers may be possible if the structure of the machine language can be optimized to assist in the confusion.
Merkle (1978) 独立研究了通过不安全通道分发密钥的问题。他的方法与上面建议的公钥密码系统的方法不同,并且将被称为公钥分发系统。目标是两个用户A和B通过不安全的通道安全地交换密钥。然后,普通密码系统中的两个用户都使用该密钥进行加密和解密。Merkle 有一个解决方案,其密码分析成本随着n 2增长,其中n是合法用户的成本。不幸的是,系统合法用户的传输时间成本与计算成本一样多,因为 Merkle 协议要求在决定一个密钥之前传输n 个潜在密钥。Merkle 指出,这种高传输开销使系统无法在实践中发挥很大作用。如果对设置协议的开销设置 1 兆比特的限制,他的技术可以实现大约 10,000 比 1 的成本比,这对于大多数应用来说太小了。如果廉价、高带宽的数据链路变得可用,则可以实现一百万比一或更大的比率,并且该系统将具有巨大的实用价值。
Merkle (1978) has independently studied the problem of distributing keys over an insecure channel. His approach is different from that of the public key cryptosystems suggested above, and will be termed a public key distribution system. The goal is for two users, A and B, to securely exchange a key over an insecure channel. This key is then used by both users in a normal cryptosystem for both enciphering and deciphering. Merkle has a solution whose cryptanalytic cost grows as $n^2$ where n is the cost to the legitimate users. Unfortunately the cost to the legitimate users of the system is as much in transmission time as in computation, because Merkle’s protocol requires n potential keys to be transmitted before one key can be decided on. Merkle notes that this high transmission overhead prevents the system from being very useful in practice. If a one megabit limit is placed on the setup protocol’s overhead, his technique can achieve cost ratios of approximately 10,000 to 1, which are too small for most applications. If inexpensive, high bandwidth data links become available, ratios of a million to one or greater could be achieved and the system would be of substantial practical value.
我们现在建议一种新的公钥分发系统,它有几个优点。首先,它只需要交换一个“密钥”。其次,合法用户的密码分析工作似乎呈指数级增长。第三,它的使用可以与用户信息的公共文件相关联,该文件用于向用户B验证用户A,反之亦然。通过使公共文件本质上成为只读存储器,一个人的外表允许用户向许多用户多次验证他的身份。Merkle的技术要求A和B通过其他方式验证彼此的身份。
We now suggest a new public key distribution system which has several advantages. First, it requires only one “key” to be exchanged. Second, the cryptanalytic effort appears to grow exponentially in the effort of the legitimate users. And, third, its use can be tied to a public file of user information which serves to authenticate user A to user B and vice versa. By making the public file essentially a read only memory, one personal appearance allows a user to authenticate his identity many times to many users. Merkle’s technique requires A and B to verify each other’s identities through other means.
新技术利用了在具有素数q元素的有限域GF ( q ) 上计算对数的明显困难。让
The new technique makes use of the apparent difficulty of computing logarithms over a finite field GF(q) with a prime number q of elements. Let
$$Y = \alpha^X \bmod q, \quad \text{for } 1 \le X \le q - 1,$$
其中α是GF ( q )的固定本原元素,则X称为Y以α为底的对数,mod q:
where α is a fixed primitive element of GF(q), then X is referred to as the logarithm of Y to the base α, mod q:
$$X = \log_\alpha Y \bmod q, \quad \text{for } 1 \le Y \le q - 1.$$
从X计算Y很容易,最多需要 2log 2 q乘法(Knuth,1969,第 398-400 页)。例如,对于X = 18,
Calculation of Y from X is easy, taking at most $2 \log_2 q$ multiplications (Knuth, 1969, pp. 398–400). For example, for X = 18,
$$Y = \alpha^{18} = (((\alpha^2)^2)^2)^2 \times \alpha^2.$$
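The repeated-squaring idea behind this bound can be sketched as below. This is one standard variant (binary exponentiation scanning the exponent from its lowest bit), offered as an illustration; it is not claimed to be the exact procedure in the cited reference. The function name `power_mod` and the toy values q = 23, α = 5 are assumptions chosen for the example.

```python
def power_mod(alpha: int, x: int, q: int):
    """Return (alpha**x mod q, number of modular multiplications performed).

    Uses at most two multiplications per bit of x: one squaring and,
    when the bit is set, one multiplication into the running result.
    """
    result = 1
    base = alpha % q
    multiplications = 0
    while x > 0:
        if x & 1:
            result = (result * base) % q
            multiplications += 1
        x >>= 1
        if x:
            base = (base * base) % q
            multiplications += 1
    return result, multiplications


if __name__ == "__main__":
    value, count = power_mod(5, 18, 23)
    print(value == pow(5, 18, 23), count <= 2 * (18).bit_length())
```

The count includes the trivial first multiplication by 1, so it can exceed the hand-optimized chain shown for X = 18 while staying within two multiplications per exponent bit.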
另一方面,从Y计算X可能要困难得多,并且对于某些精心选择的q值,需要使用最著名的算法进行q 1/2运算(Pohlig 和 Hellman,1978 年;Knuth,1973 年) ,第 9 页,575–576)。
Computing X from Y, on the other hand, can be much more difficult and, for certain carefully chosen values of q, requires on the order of $q^{1/2}$ operations, using the best known algorithm (Pohlig and Hellman, 1978; Knuth, 1973, pp. 9, 575–576).
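To make the $q^{1/2}$ figure concrete, here is a sketch of one well-known square-root-time method, usually called baby-step giant-step. It is offered as an illustration of that cost class and is not claimed to be the specific algorithm the cited references describe. The function name and the toy parameters (q = 23, α = 5) are assumptions; the code needs Python 3.8 or later for `math.isqrt` and negative exponents in `pow`.

```python
import math


def discrete_log(alpha: int, y: int, q: int):
    """Find x with alpha**x % q == y using about sqrt(q) steps and table entries.

    Returns an exponent in the range 0 .. q-2, or None if no exponent exists.
    """
    m = math.isqrt(q - 1) + 1          # m * m >= q - 1
    baby = {}
    value = 1
    for j in range(m):                 # baby steps: alpha^j for j < m
        baby.setdefault(value, j)
        value = (value * alpha) % q
    factor = pow(alpha, -m, q)         # alpha^(-m) mod q
    gamma = y % q
    for i in range(m):                 # giant steps: y * alpha^(-i*m)
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = (gamma * factor) % q
    return None


if __name__ == "__main__":
    q, alpha = 23, 5                   # toy field; 5 is a primitive element mod 23
    y = pow(alpha, 18, q)
    x = discrete_log(alpha, y, q)
    print(pow(alpha, x, q) == y)
```

Both the running time and the table size grow like the square root of q, which is why a 200 bit modulus puts this approach far out of reach.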
我们技术的安全性很大程度上取决于计算对数 mod q的难度,如果找到一个复杂性随着 log 2 q增长的算法,我们的系统就会崩溃。虽然问题陈述的简单性可能允许如此简单的算法,但它可能反而允许证明问题的难度。现在我们假设计算 log mod q的最著名算法实际上接近最优,因此对于正确选择的q来说, q 1/2是问题复杂性的良好衡量标准。
The security of our technique depends crucially on the difficulty of computing logarithms mod q, and if an algorithm whose complexity grew as $\log_2 q$ were to be found, our system would be broken. While the simplicity of the problem statement might allow such simple algorithms, it might instead allow a proof of the problem’s difficulty. For now we assume that the best known algorithm for computing logs mod q is in fact close to optimal and hence that $q^{1/2}$ is a good measure of the problem’s complexity, for a properly chosen q.
每个用户生成一个从整数集合 {1, 2 , … , q − 1} 中统一选择的独立随机数X i。每个人都对X i保密,但将Y i = α X i mod q与他的姓名和地址一起放入公共文件中。当用户i和j希望进行私密通信时,他们使用K ij = α X i X j mod q作为密钥。用户i通过从公共文件中获取Y j并让
Each user generates an independent random number $X_i$ chosen uniformly from the set of integers {1, 2, …, q − 1}. Each keeps $X_i$ secret, but places $Y_i = \alpha^{X_i} \bmod q$ in a public file with his name and address. When users i and j wish to communicate privately, they use $K_{ij} = \alpha^{X_i X_j} \bmod q$ as their key. User i obtains $K_{ij}$ by obtaining $Y_j$ from the public file and letting
$$K_{ij} = Y_j^{X_i} \bmod q = (\alpha^{X_j})^{X_i} \bmod q = \alpha^{X_j X_i} = \alpha^{X_i X_j} \bmod q.$$
用户j以类似的方式获得 $K_{ij}$:$K_{ij} = Y_i^{X_j} \bmod q$。另一个用户必须根据 $Y_i$ 和 $Y_j$ 计算 $K_{ij}$,例如通过计算 $K_{ij} = Y_i^{(\log_\alpha Y_j)} \bmod q$。
User j obtains $K_{ij}$ in the similar fashion, $K_{ij} = Y_i^{X_j} \bmod q$. Another user must compute $K_{ij}$ from $Y_i$ and $Y_j$, for example, by computing $K_{ij} = Y_i^{(\log_\alpha Y_j)} \bmod q$.
因此我们看到,如果 log mod q很容易计算,系统就可能被破坏。虽然我们目前没有相反的证明(即,如果 log mod q难以计算,则系统是安全的),但我们也没有看到任何方法可以从Y i和Y j计算K ij而无需首先获得X i或X j。
We thus see that if logs mod q are easily computed the system can be broken. While we do not currently have a proof of the converse (i.e., that the system is secure if logs mod q are difficult to compute), neither do we see any way to compute $K_{ij}$ from $Y_i$ and $Y_j$ without first obtaining either $X_i$ or $X_j$.
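The exchange just described can be sketched with a toy modulus. The values q = 23 and α = 5 are assumptions picked only so the numbers stay readable; they offer no security. The helper names `make_keypair` and `shared_key` are hypothetical.

```python
import secrets


def make_keypair(q: int, alpha: int):
    """Pick a secret X uniformly from {1, ..., q-1} and publish Y = alpha^X mod q."""
    x = secrets.randbelow(q - 1) + 1
    y = pow(alpha, x, q)
    return x, y


def shared_key(their_public: int, my_secret: int, q: int) -> int:
    """Compute K = (their Y)^(my X) mod q."""
    return pow(their_public, my_secret, q)


if __name__ == "__main__":
    q, alpha = 23, 5                      # toy parameters, far too small for real use
    x_i, y_i = make_keypair(q, alpha)     # user i
    x_j, y_j = make_keypair(q, alpha)     # user j
    k_i = shared_key(y_j, x_i, q)         # user i combines Y_j with X_i
    k_j = shared_key(y_i, x_j, q)         # user j combines Y_i with X_j
    print(k_i == k_j)
```

Only the public values cross the channel; each side combines the other's public value with its own secret to reach the same key.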
如果q是略小于 2 b的素数,则所有量都可以表示为b位数字。求幂最多需要 2 b乘法 mod q,而根据假设,取对数需要q 1/2 = 2 b/ 2运算。因此,密码分析工作相对于合法工作呈指数级增长。如果b = 200,则最多需要 400 次乘法才能根据X i计算Y i或根据Y i和X j计算K ij,但取对数 mod q需要 2 100 次或大约 10 30次运算。
If q is a prime slightly less than $2^b$, then all quantities are representable as b bit numbers. Exponentiation then takes at most 2b multiplications mod q, while by hypothesis taking logs requires $q^{1/2} = 2^{b/2}$ operations. The cryptanalytic effort therefore grows exponentially relative to legitimate efforts. If b = 200, then at most 400 multiplications are required to compute $Y_i$ from $X_i$, or $K_{ij}$ from $Y_i$ and $X_j$, yet taking logs mod q requires $2^{100}$ or approximately $10^{30}$ operations.
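The arithmetic in this comparison is easy to check directly. The sketch below simply evaluates the two cost estimates stated in the text for a given bit length b; the function name `effort` is hypothetical, and the estimates themselves rest on the paper's hypothesis about the cost of taking logarithms.

```python
def effort(b: int):
    """Return (legitimate multiplications, hypothesized log operations) for b-bit q.

    Legitimate users need at most 2b multiplications; the opponent is assumed
    to need about 2^(b/2) operations. b is taken to be even here.
    """
    return 2 * b, 2 ** (b // 2)


if __name__ == "__main__":
    legit, attack = effort(200)
    print(legit)
    print(f"{attack:.3e}")
```

Doubling b doubles the legitimate work but squares the opponent's, which is the sense in which the cryptanalytic effort grows exponentially.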
与密钥分发问题相比,身份验证问题可能是普遍采用电信进行商业交易的更严重的障碍。身份验证是任何涉及合同和计费的系统的核心。没有它,业务就无法运作。当前的电子认证系统无法满足纯数字、不可伪造的、消息相关的签名。它们可以防止第三方伪造,但不能防止发送器和接收器之间的纠纷。
The problem of authentication is perhaps an even more serious barrier to the universal adoption of telecommunications for business transactions than the problem of key distribution. Authentication is at the heart of any system involving contracts and billing. Without it, business cannot function. Current electronic authentication systems cannot meet the need for a purely digital, unforgeable, message dependent signature. They provide protection against third party forgeries, but do not protect against disputes between transmitter and receiver.
为了开发一种能够用某种纯电子通信形式取代当前书面合同的系统,我们必须发现一种与书面签名具有相同属性的数字现象。任何人都必须很容易识别签名的真实性,但除了合法签名者之外的任何人都不可能生成它。我们将任何此类技术称为单向身份验证。由于任何数字信号都可以被精确复制,因此真实的数字签名必须是可识别的且不为人所知。
In order to develop a system capable of replacing the current written contract with some purely electronic form of communication, we must discover a digital phenomenon with the same properties as a written signature. It must be easy for anyone to recognize the signature as authentic, but impossible for anyone other than the legitimate signer to produce it. We will call any such technique one-way authentication. Since any digital signal can be copied precisely, a true digital signature must be recognizable without being known.
考虑多用户计算机系统中的“登录”问题。当设置他的帐户时,用户选择一个密码,该密码被输入到系统的密码目录中。每次登录时,系统都会再次要求用户提供密码。通过对所有其他用户保密此密码,可以防止伪造登录。然而,这使得保护密码目录的安全变得至关重要,因为它包含的信息将允许完美模拟任何用户。如果系统操作员有合法理由访问目录,问题就会进一步复杂化。允许此类合法访问但阻止所有其他访问几乎是不可能的。
Consider the “login” problem in a multiuser computer system. When setting up his account, the user chooses a password which is entered into the system’s password directory. Each time he logs in, the user is again asked to provide his password. By keeping this password secret from all other users, forged logins are prevented. This, however, makes it vital to preserve the security of the password directory since the information it contains would allow perfect impersonation of any user. The problem is further compounded if system operators have legitimate reasons for accessing the directory. Allowing such legitimate accesses, but preventing all others, is next to impossible.
这导致需要一个新的登录程序来判断密码的真实性而无需真正知道密码,这显然是不可能的。虽然从逻辑上看似乎不可能,但这个提议很容易得到满足。当用户第一次输入密码PW时,计算机会自动且透明地计算函数f ( PW ) 并将其(而不是PW )存储在密码目录中。每次连续登录时,计算机都会计算f ( X )(其中X是提供的密码),并将f ( X ) 与存储的值f ( PW ) 进行比较。当且仅当它们相等时,用户才被认为是真实的。由于函数f每次登录必须计算一次,因此其计算时间必须很小。100 万条指令(按 200 年价格计算,成本约为 0.10 美元)似乎是此计算的合理限制。然而,如果我们能够确保f -1的计算需要 10 30 条或更多指令,那么破坏系统以获取密码目录的人实际上无法从f ( PW ) 获取PW,因此无法执行未经授权的操作。登录。请注意,登录程序不会接受f ( PW ) 作为密码,因为它会自动计算f ( f ( PW )),该 f ( f ( PW )) 与密码目录中的条目f ( PW )不匹配。
This leads to the apparently impossible requirement for a new login procedure capable of judging the authenticity of passwords without actually knowing them. While appearing to be a logical impossibility, this proposal is easily satisfied. When the user first enters his password PW, the computer automatically and transparently computes a function f(PW) and stores this, not PW, in the password directory. At each successive login, the computer calculates f(X), where X is the proffered password, and compares f(X) with the stored value f(PW). If and only if they are equal, the user is accepted as being authentic. Since the function f must be calculated once per login, its computation time must be small. A million instructions (costing approximately \$0.10 at bicentennial prices) seems to be a reasonable limit on this computation. If we could ensure, however, that calculation of $f^{-1}$ required $10^{30}$ or more instructions, someone who had subverted the system to obtain the password directory could not in practice obtain PW from f(PW), and could thus not perform an unauthorized login. Note that f(PW) is not accepted as a password by the login program since it will automatically compute f(f(PW)) which will not match the entry f(PW) in the password directory.
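The login procedure just described might look as follows. This is a minimal sketch: SHA-256 is used as a stand-in for the one-way function f (an assumption, since the text names no specific function), and `password_directory`, `enroll`, and `login` are hypothetical names.

```python
import hashlib
import hmac

def f(password: str) -> str:
    """Stand-in one-way function (SHA-256, hex encoded); an assumption, not the paper's f."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# The directory stores f(PW), never PW itself.
password_directory = {}

def enroll(user: str, password: str) -> None:
    password_directory[user] = f(password)

def login(user: str, proffered: str) -> bool:
    """Accept if and only if f(X) equals the stored f(PW)."""
    stored = password_directory.get(user)
    if stored is None:
        return False
    return hmac.compare_digest(f(proffered), stored)

enroll("alice", "correct horse")
```

Submitting the stored entry itself fails, because the program compares f(f(PW)) against f(PW). Present-day systems usually add a per-user salt and a deliberately slow function; both are omitted here to stay close to the scheme in the text.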
我们假设函数f是公开信息,因此并不是对f的无知导致f -1的计算变得困难。此类函数称为单向函数,最早由 RM Needham 在登录过程中使用(Wilkes,1972,第 91 页)。最近的两篇论文(Evans et al., 1974; Purdy, 1974)也对它们进行了讨论,其中提出了设计单向函数的有趣方法。
We assume that the function f is public information, so that it is not ignorance of f which makes calculation of $f^{-1}$ difficult. Such functions are called one-way functions and were first employed for use in login procedures by R. M. Needham (Wilkes, 1972, p. 91). They are also discussed in two recent papers (Evans et al., 1974; Purdy, 1974) which suggest interesting approaches to the design of one-way functions.
更准确地说,函数f是单向函数,如果对于f域中的任何参数x,很容易计算相应的值f ( x ),然而,对于f范围内的几乎所有y,它对于任何合适的参数x求解方程y = f ( x ) 在计算上是不可行的。
More precisely, a function f is a one-way function if, for any argument x in the domain of f, it is easy to compute the corresponding value f(x), yet, for almost all y in the range of f, it is computationally infeasible to solve the equation y = f(x) for any suitable argument x.
值得注意的是,我们定义的函数从计算的角度来看是不可逆的,但其不可逆性与数学中通常遇到的函数完全不同。当点y的反函数不唯一时(即,存在不同的点x 1和x 2使得f ( x 1 ) = y = f ( x 2 )),函数f通常被称为“不可逆”。我们强调,这不是所需要的那种反演难度。相反,在给定y值和f知识的情况下,计算任何具有f ( x ) = y属性的x一定是极其困难的。事实上,如果f在通常意义上是不可逆的,那么寻找逆像的任务可能会更容易。在极端情况下,如果对于定义域中的所有x , f ( x ) = y 0 ,那么f的范围是{ y 0 },并且我们可以将任何x视为f −1 ( y 0 )。因此,f不能太简并。小程度的简并性是可以容忍的,并且正如后面所讨论的,可能存在于最有前途的一类单向函数中。
It is important to note that we are defining a function which is not invertible from a computational point of view, but whose noninvertibility is entirely different from that normally encountered in mathematics. A function f is normally called “noninvertible” when the inverse of a point y is not unique (i.e., there exist distinct points $x_1$ and $x_2$ such that $f(x_1) = y = f(x_2)$). We emphasize that this is not the sort of inversion difficulty that is required. Rather, it must be overwhelmingly difficult, given a value y and knowledge of f, to calculate any x whatsoever with the property that f(x) = y. Indeed, if f is noninvertible in the usual sense, it may make the task of finding an inverse image easier. In the extreme, if $f(x) = y_0$ for all x in the domain, then the range of f is $\{y_0\}$, and we can take any x as $f^{-1}(y_0)$. It is therefore necessary that f not be too degenerate. A small degree of degeneracy is tolerable and, as discussed later, is probably present in the most promising class of one-way functions.
多项式提供了单向函数的基本示例。找到多项式方程p ( x ) = y的根x 0比计算x = x 0处的多项式p ( x )困难得多。Purdy (1974) 建议在有限域上使用非常高次的稀疏多项式,这似乎具有非常高的解与评估时间的比率。单向函数的理论基础在第42.6节中进行了更详细的讨论。而且,如第42.5节所示,单向函数在实践中很容易设计。
Polynomials offer an elementary example of one-way functions. It is much harder to find a root $x_0$ of the polynomial equation p(x) = y than it is to evaluate the polynomial p(x) at $x = x_0$. Purdy (1974) has suggested the use of sparse polynomials of very high degree over finite fields, which appear to have very high ratios of solution to evaluation time. The theoretical basis for one-way functions is discussed at greater length in §42.6. And, as shown in §42.5, one-way functions are easy to devise in practice.
单向功能登录协议仅解决多用户系统中出现的部分问题。它可以防止系统的身份验证数据在不使用时被泄露,但仍然要求用户向系统发送真实密码。必须通过额外的加密来防止窃听,并且完全不存在针对争议威胁的保护。
The one-way function login protocol solves only some of the problems arising in a multiuser system. It protects against compromise of the system’s authentication data when it is not in use, but still requires the user to send the true password to the system. Protection against eavesdropping must be provided by additional encryption, and protection against the threat of dispute is absent altogether.
公钥密码系统可用于产生真正的单向认证系统,如下所示。如果用户A希望向用户B发送消息M,他用自己的秘密解密密钥“解密”该消息并发送D A ( M )。当用户B收到它时,他可以读取它,并通过使用用户A的公共加密密钥E A对其进行“加密”来确保其真实性。B还保存D A ( M ) 作为消息来自A 的证据。任何人都可以通过使用众所周知的操作E A对D A ( M ) 进行操作来恢复M来验证这一说法。由于只有A可以生成具有此属性的消息,因此单向身份验证问题的解决方案将立即从公钥密码系统的开发中得到解决。
A public key cryptosystem can be used to produce a true one-way authentication system as follows. If user A wishes to send a message M to user B, he “deciphers” it in his secret deciphering key and sends $D_A(M)$. When user B receives it, he can read it, and be assured of its authenticity by “enciphering” it with user A’s public enciphering key $E_A$. B also saves $D_A(M)$ as proof that the message came from A. Anyone can check this claim by operating on $D_A(M)$ with the publicly known operation $E_A$ to recover M. Since only A could have generated a message with this property, the solution to the one-way authentication problem would follow immediately from the development of public key cryptosystems.
单向消息身份验证有一种由马萨诸塞州计算机协会的 Leslie Lamport 向作者建议的部分解决方案。该技术采用单向函数f将k维二进制空间映射到其自身,k约为 100。如果发送器希望发送N位消息,他会生成 2 N随机选择的k维二进制向量x 1 , X 1 , x 2 , X 2 , … , x N , X N,他对此保密。接收者在f下得到相应的图像,即y 1 , Y 1 , y 2 , Y 2 , … , y N , Y N。随后,当要发送消息m = ( m 1 , m 2 , … , m N ) 时,发送器根据 m 1 = 0 或 1 发送x 1或X 1。他根据m 1 = 0或 1 发送x 2或X 2 m 2 = 0还是1 等。接收器对第一个接收到的块使用f进行操作,看看它是否产生y 1或Y 1作为其图像,从而了解它是x 1还是X 1,并且无论m 1 = 0还是1。以类似的方式,接收器能够确定m 2、m 3、...、m N。但接收方无法伪造m的哪怕一位变化。
One-way message authentication has a partial solution suggested to the authors by Leslie Lamport of Massachusetts Computer Associates. This technique employs a one-way function f mapping k-dimensional binary space into itself for k on the order of 100. If the transmitter wishes to send an N bit message he generates 2N, randomly chosen, k-dimensional binary vectors $x_1, X_1, x_2, X_2, \ldots, x_N, X_N$ which he keeps secret. The receiver is given the corresponding images under f, namely $y_1, Y_1, y_2, Y_2, \ldots, y_N, Y_N$. Later, when the message $m = (m_1, m_2, \ldots, m_N)$ is to be sent, the transmitter sends $x_1$ or $X_1$ depending on whether $m_1 = 0$ or 1. He sends $x_2$ or $X_2$ depending on whether $m_2 = 0$ or 1, etc. The receiver operates with f on the first received block and sees whether it yields $y_1$ or $Y_1$ as its image and thus learns whether it was $x_1$ or $X_1$, and whether $m_1 = 0$ or 1. In a similar manner the receiver is able to determine $m_2, m_3, \ldots, m_N$. But the receiver is incapable of forging a change in even one bit of m.
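The Lamport scheme above can be sketched directly. This is an illustration under stated assumptions: SHA-256 truncated to 128 bits stands in for the one-way function f on k-dimensional binary space, and the names `keygen`, `sign`, and `recover` are hypothetical.

```python
import hashlib
import secrets

K_BYTES = 16  # k = 128 bits, close to the text's "k on the order of 100"

def f(block: bytes) -> bytes:
    """Stand-in one-way function: SHA-256 truncated to k bits (an assumption)."""
    return hashlib.sha256(block).digest()[:K_BYTES]

def keygen(n_bits):
    """Generate secret pairs (x_i, X_i) and their public images (y_i, Y_i)."""
    secret = [(secrets.token_bytes(K_BYTES), secrets.token_bytes(K_BYTES))
              for _ in range(n_bits)]
    public = [(f(lo), f(hi)) for lo, hi in secret]
    return secret, public

def sign(message_bits, secret):
    """Send x_i when m_i = 0 and X_i when m_i = 1."""
    return [secret[i][bit] for i, bit in enumerate(message_bits)]

def recover(blocks, public):
    """Apply f to each block; return the message bits, or None if a block is invalid."""
    bits = []
    for block, (y_lo, y_hi) in zip(blocks, public):
        image = f(block)
        if image == y_lo:
            bits.append(0)
        elif image == y_hi:
            bits.append(1)
        else:
            return None
    return bits

message = [1, 0, 1, 1, 0, 0, 1, 0]
secret, public = keygen(len(message))
blocks = sign(message, secret)
```

Each key pair authenticates exactly one message: once the transmitter reveals one preimage per position, the unrevealed preimages are what stop the receiver from altering any bit.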
这只是部分解决方案,因为需要大约 100 倍的数据扩展。然而,有一种修改可以消除当N大约为兆位或更大时的扩展问题。令g为从二进制N空间到二进制n空间的单向映射,其中n约为 50。采用N位消息m并用g对其进行运算以获得n位向量m '。然后使用前面的方案发送m ′。如果N = 10 6、n = 50 且k = 100,则会向消息添加kn = 5000 个身份验证位。因此,在传输过程中仅需要 5% 的数据扩展(如果包括y 1 , Y 1 , … , y N , Y N的初始交换,则需要 15% )。即使有大量其他消息(平均2 N − n )具有相同的认证序列, g的单向性使得它们在计算上无法找到并因此无法伪造。实际上g必须比正常的单向函数强一些,因为对手不仅有m ' 还有它的逆像m之一。即使给定m也很难找到m ' 的不同逆像。找到这样的函数似乎没什么麻烦(参见§42.5 )。
This is only a partial solution because of the approximately 100-fold data expansion required. There is, however, a modification which eliminates the expansion problem when N is roughly a megabit or more. Let g be a one-way mapping from binary N-space to binary n-space where n is approximately 50. Take the N bit message m and operate on it with g to obtain the n bit vector m′. Then use the previous scheme to send m′. If $N = 10^6$, n = 50, and k = 100, this adds kn = 5000 authentication bits to the message. It thus entails only a 5 percent data expansion during transmission (or 15 percent if the initial exchange of $y_1, Y_1, \ldots, y_N, Y_N$ is included). Even though there are a large number of other messages ($2^{N-n}$ on the average) with the same authentication sequence, the one-wayness of g makes them computationally infeasible to find and thus to forge. Actually g must be somewhat stronger than a normal one-way function, since an opponent has not only m′ but also one of its inverse images m. It must be hard even given m to find a different inverse image of m′. Finding such functions appears to offer little trouble (see §42.5).
对于单向用户身份验证问题还有另一种部分解决方案。用户生成一个他保密的密码X。他给出了系统f T ( X ),其中f是单向函数。在时间t,适当的验证器是f T − t ( X ),系统可以通过应用f t ( X ) 来检查它。由于f的单向性,过去的响应对于形成新的响应没有任何价值。该解决方案的问题在于,合法登录可能需要大量计算(尽管比伪造登录少许多数量级)。例如,如果t每秒递增,并且系统必须对每个密码工作一个月,则T = 260 万。用户和系统每次登录都必须平均迭代 130 万次。虽然并非无法克服,但这个问题显然限制了该技术的使用。如果能够找到一种简单的计算f (2 n )的方法(对于n = 1,,2, … ),就像X 8 = (( X 2 ) 2 ) 2那样,这个问题就可以得到解决。因此,T − t和t的二元分解将允许快速计算f T − t和f t。然而, f n的快速计算可能会阻止f成为单向的。
There is another partial solution to the one-way user authentication problem. The user generates a password X which he keeps secret. He gives the system $f^T(X)$, where f is a one-way function. At time t the appropriate authenticator is $f^{T-t}(X)$, which can be checked by the system by applying $f^t$ to it. Because of the one-wayness of f, past responses are of no value in forging a new response. The problem with this solution is that it can require a fair amount of computation for legitimate login (although many orders of magnitude less than for forgery). If for example t is incremented every second and the system must work for one month on each password then T = 2.6 million. Both the user and the system must then iterate f an average of 1.3 million times per login. While not insurmountable, this problem obviously limits use of the technique. The problem could be overcome if a simple method for calculating $f^{(2^n)}$, for n = 1, 2, …, could be found, much as $X^8 = ((X^2)^2)^2$. For then binary decompositions of T − t and t would allow rapid computation of $f^{T-t}$ and $f^t$. It may be, however, that rapid computation of $f^n$ precludes f from being one-way.
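The iterated-function login can be sketched as follows. SHA-256 again stands in for f (an assumption), T is kept small so the example runs quickly, and the names `iterate`, `authenticator`, and `check` are hypothetical.

```python
import hashlib

def f(value: bytes) -> bytes:
    """Stand-in one-way function (SHA-256); an assumption, not the paper's f."""
    return hashlib.sha256(value).digest()

def iterate(value: bytes, times: int) -> bytes:
    """Apply f the given number of times, i.e. compute f^times(value)."""
    for _ in range(times):
        value = f(value)
    return value

T = 1000                              # small illustrative lifetime
secret_x = b"user secret X"           # known only to the user
registered = iterate(secret_x, T)     # the system stores f^T(X)

def authenticator(t: int) -> bytes:
    """User side: the response at time t is f^(T-t)(X)."""
    return iterate(secret_x, T - t)

def check(response: bytes, t: int) -> bool:
    """System side: apply f another t times and compare with f^T(X)."""
    return iterate(response, t) == registered
```

A response captured at time t − 1 equals f applied to the response for time t, so replaying it at time t fails, and deriving the newer response from it would require inverting f.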
在本节中,我们将证明迄今为止提出的一些密码问题可以简化为其他问题,从而根据难度定义松散的排序。我们还介绍了更困难的活板门问题。
In this section, we will show that some of the cryptographic problems presented thus far can be reduced to others, thereby defining a loose ordering according to difficulty. We also introduce the more difficult problem of trap doors.
在第42.2节中,我们展示了旨在保护隐私的加密系统也可用于提供针对第三方伪造的身份验证。这样的系统也可用于创建其他加密对象。
In §42.2 we showed that a cryptographic system intended for privacy can also be used to provide authentication against third party forgeries. Such a system can be used to create other cryptographic objects, as well.
可以使用一种能够抵御已知明文攻击的密码系统来产生单向函数。
A cryptosystem which is secure against a known plaintext attack can be used to produce a one-way function.
如图42.3所示,采用可安全抵御已知明文攻击的密码系统 { S K : { P } → { C }} K ∈{ K } ,固定P = P 0并考虑映射f : { K } → { C } 由f ( X ) = S X ( P 0 )定义。
As indicated in Figure 42.3, take the cryptosystem $\{S_K : \{P\} \to \{C\}\}_{K \in \{K\}}$ which is secure against a known plaintext attack, fix $P = P_0$ and consider the map $f : \{K\} \to \{C\}$ defined by $f(X) = S_X(P_0)$.
该函数是单向的,因为给定f ( X ) 求解X相当于从单个已知明文密码对中查找密钥的密码分析问题。f的公共知识现在等同于 { SK } 和P 0的公共知识。
This function is one-way because solving for X given f(X) is equivalent to the cryptanalytic problem of finding the key from a single known plaintext-cryptogram pair. Public knowledge of f is now equivalent to public knowledge of $\{S_K\}$ and $P_0$.
图 42.3: 用作单向函数的安全密码系统
Figure 42.3: Secure cryptosystem used as one-way function
虽然这一结果的反面不一定成立,但最初在搜索单向函数时发现的函数有可能产生良好的密码系统。这实际上发生在第42.3节中讨论的离散指数函数(Pohlig 和 Hellman,1978)。
While the converse of this result is not necessarily true, it is possible for a function originally found in the search for one-way functions to yield a good cryptosystem. This actually happened with the discrete exponential function discussed in §42.3 (Pohlig and Hellman, 1978).
单向函数是分组密码和密钥生成器的基础。密钥生成器是一种伪随机位生成器,其输出(即密钥流)与以二进制形式表示的消息相加模 2,模仿一次性密码本。密钥用作确定伪随机密钥流序列的“种子”。因此,已知的明文攻击简化为从密钥流确定密钥的问题。为了使系统安全,从密钥流计算密钥必须在计算上不可行。然而,为了使系统可用,从密钥计算密钥流的计算必须简单。因此,根据定义,一个好的密钥生成器几乎是一种单向函数。
One-way functions are basic to both block ciphers and key generators. A key generator is a pseudorandom bit generator whose output, the keystream, is added modulo 2 to a message represented in binary form, in imitation of a one-time pad. The key is used as a “seed” which determines the pseudorandom keystream sequence. A known plaintext attack thus reduces to the problem of determining the key from the keystream. For the system to be secure, computation of the key from the keystream must be computationally infeasible. While, for the system to be usable, calculation of the keystream from the key must be computationally simple. Thus a good key generator is, almost by definition, a one-way function.
使用任一类型的密码系统作为单向函数都会遇到一个小问题。如前所述,如果函数f不是唯一可逆的,则没有必要(或不可能)找到所使用的X的实际值。相反,任何具有相同图像的X就足够了。并且,虽然密码系统中的每个映射S K必须是双射的,但上面定义的从密钥到密码的函数f没有这样的限制。事实上,保证密码系统具有这种属性似乎相当困难。在一个好的密码系统中,映射f可以预期具有随机选择映射的特征(即,f ( X i ) 是从所有可能的Y中统一选择的,并且连续的选择是独立的)。在这种情况下,如果统一选择X并且有相同数量的密钥和消息( X和Y ),则所得Y具有k + 1 个逆的概率约为e -1 /k!对于k = 0, 1, 2, 3, ...。这是泊松分布,平均值λ = 1,平移 1 个单位。因此,预期的逆数仅为 2。虽然f可能更加退化,但一个好的密码系统不会太退化,因为那样密钥就没有得到很好的使用。在最坏的情况下,如果对于某些Y 0 f ( X ) == Y 0,我们有S K ( P 0 ) = C 0,并且P 0的加密根本不依赖于密钥!
Use of either type of cryptosystem as a one-way function suffers from a minor problem. As noted earlier, if the function f is not uniquely invertible, it is not necessary (or possible) to find the actual value of X used. Rather any X with the same image will suffice. And, while each mapping $S_K$ in a cryptosystem must be bijective, there is no such restriction on the function f from key to cryptogram defined above. Indeed, guaranteeing that a cryptosystem has this property appears quite difficult. In a good cryptosystem the mapping f can be expected to have the characteristics of a randomly chosen mapping (i.e., $f(X_i)$ is chosen uniformly from all possible Y, and successive choices are independent). In this case, if X is chosen uniformly and there are an equal number of keys and messages (X and Y), then the probability that the resultant Y has k + 1 inverses is approximately $e^{-1}/k!$ for k = 0, 1, 2, 3, …. This is a Poisson distribution with mean λ = 1, shifted by 1 unit. The expected number of inverses is thus only 2. While it is possible for f to be more degenerate, a good cryptosystem will not be too degenerate since then the key is not being well used. In the worst case, if $f(X) \equiv Y_0$ for some $Y_0$, we have $S_K(P_0) = C_0$, and encipherment of $P_0$ would not depend on the key at all!
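The figure of 2 for the expected number of inverses follows from the shifted Poisson distribution just stated. As a worked check (the symbol $E$ is introduced here only for this calculation):

```latex
E \;=\; \sum_{k=0}^{\infty} (k+1)\,\frac{e^{-1}}{k!}
  \;=\; e^{-1}\left(\sum_{k=1}^{\infty} \frac{1}{(k-1)!} \;+\; \sum_{k=0}^{\infty} \frac{1}{k!}\right)
  \;=\; e^{-1}\,(e + e)
  \;=\; 2 .
```

The first sum is the mean λ = 1 of the Poisson distribution and the second is the total probability 1, so the shift by one unit is what raises the expectation from 1 to 2.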
虽然我们通常对定义域和值域大小相当的函数感兴趣,但也有例外。在上一节中,我们需要一个单向函数将长字符串映射到更短的字符串。通过使用密钥长度大于块大小的分组密码,可以使用上述技术来获得这样的函数。
While we are usually interested in functions whose domain and range are of comparable size, there are exceptions. In the previous section we required a one-way function mapping long strings onto much shorter ones. By using a block cipher whose key length is larger than the blocksize, such functions can be obtained using the above technique.
埃文斯等人。(1974) 对于从分组密码构造单向函数的问题有一种不同的方法。他们没有选择固定的P 0作为输入,而是使用函数f ( X ) = S X ( X )。这是一种很有吸引力的方法,因为即使S族相对简单,这种形式的方程通常也很难求解。然而,这种增加的复杂性破坏了系统S在已知明文攻击下的安全性与f的单向性之间的等价性。
Evans et al. (1974) have a different approach to the problem of constructing a one-way function from a block cipher. Rather than selecting a fixed $P_0$ as the input, they use the function $f(X) = S_X(X)$. This is an attractive approach because equations of this form are generally difficult to solve, even when the family S is comparatively simple. This added complexity, however, destroys the equivalence between the security of the system S under a known plaintext attack and the one-wayness of f.
另一种关系已在第42.4节中显示。
Another relationship has already been shown in §42.4.
公钥密码系统可用于生成单向认证系统。
A public key cryptosystem can be used to generate a one-way authentication system.
相反的情况似乎并不成立,这使得公钥密码系统的构建成为一个比单向身份验证严格上更困难的问题。类似地,公钥密码系统可以用作公钥分发系统,但反之则不然。
The converse does not appear to hold, making the construction of a public key cryptosystem a strictly more difficult problem than one-way authentication. Similarly, a public key cryptosystem can be used as a public key distribution system, but not conversely.
由于在公钥密码系统中,使用E和D 的通用系统必须是公开的,因此指定E指定将输入消息转换为输出密码的完整算法。因为这样的公钥系统实际上是一组活板门单向函数。这些函数实际上并不是单向的,因为存在简单计算的逆函数。但是给定前向函数的算法,在计算上不可能找到简单计算的逆函数。只有通过了解某些陷门信息(例如,产生E - D对的随机位串),才能轻松找到易于计算的逆。
Since in a public key cryptosystem the general system in which E and D are used must be public, specifying E specifies a complete algorithm for transforming input messages into output cryptograms. As such, a public key system is really a set of trap-door one-way functions. These are functions which are not really one-way in that simply computed inverses exist. But given an algorithm for the forward function it is computationally infeasible to find a simply computed inverse. Only through knowledge of certain trap-door information (e.g., the random bit string which produced the E-D pair) can one easily find the easily computed inverse.
活板门已经在前一段中以活板门单向函数的形式出现过,但还存在其他变体。活板门密码是一种强烈抵制任何不掌握密码设计中使用的活板门信息的人进行密码分析的密码。这使得设计者可以在将系统出售给客户后对其进行破坏,但却错误地维持了他作为安全系统构建者的声誉。值得注意的是,设计者能够做到别人做不到的事情,并不是因为他有更高的聪明才智或密码学知识。如果他失去了活板门的信息,他的处境并不比其他人好。这种情况与密码锁非常相似。任何知道密码的人都可以在几秒钟内完成即使是熟练的锁匠也需要数小时才能完成的工作。然而,如果他忘记了组合,他就没有优势了。
Trap doors have already been seen in the previous paragraph in the form of trap-door one-way functions, but other variations exist. A trap-door cipher is one which strongly resists cryptanalysis by anyone not in possession of trap-door information used in the design of the cipher. This allows the designer to break the system after he has sold it to a client and yet falsely to maintain his reputation as a builder of secure systems. It is important to note that it is not greater cleverness or knowledge of cryptography which allows the designer to do what others cannot. If he were to lose the trap-door information he would be no better off than anyone else. The situation is precisely analogous to a combination lock. Anyone who knows the combination can do in seconds what even a skilled locksmith would require hours to accomplish. And yet, if he forgets the combination, he has no advantage.
陷门密码系统可用于产生公钥分发系统。
A trap-door cryptosystem can be used to produce a public key distribution system.
为了让A和B建立公共私钥,A随机选择一个密钥并向B发送任意明文密码对。B公开了活板门密码,但对活板门信息保密,使用明文-密码对来求解密钥。A和B现在有一个共同的密钥。
For A and B to establish a common private key, A chooses a key at random and sends an arbitrary plaintext-cryptogram pair to B. B, who made the trap-door cipher public, but kept the trap-door information secret, uses the plaintext-cryptogram pair to solve for the key. A and B now have a key in common.
目前几乎没有证据表明活板门密码的存在。然而,它们是一种明显的可能性,在接受来自可能对手的密码系统时应该记住它们(Diffie 和 Hellman,1977)。
There is currently little evidence for the existence of trap-door ciphers. However they are a distinct possibility and should be remembered when accepting a cryptosystem from a possible opponent (Diffie and Hellman, 1977).
根据定义,我们要求活板门问题是设计活板门在计算上可行的问题。这为第三种类型的实体留下了空间,我们将使用前缀“准”。例如,准单向函数不是单向的,因为存在容易计算的逆。然而,即使对于设计者来说,找到容易计算的逆函数在计算上也是不可行的。因此,可以使用准单向函数来代替单向函数,而基本上不会损失安全性。
By definition, we will require that a trap-door problem be one in which it is computationally feasible to devise the trap door. This leaves room for yet a third type of entity for which we shall use the prefix “quasi.” For example a quasi one-way function is not one-way in that an easily computed inverse exists. However, it is computationally infeasible, even for the designer, to find the easily computed inverse. Therefore a quasi one-way function can be used in place of a one-way function with essentially no loss in security.
将陷门信息丢失给陷门单向函数使其成为准单向函数,但也可能存在以这种方式无法获得的单向函数。
Losing the trap-door information to a trap-door one-way function makes it into a quasi one-way function, but there may also be one-way functions not obtainable in this manner.
准单向函数被排除在单向函数类别之外完全是一个定义问题。人们可以谈论广义或严格意义上的单向函数。
It is entirely a matter of definition that quasi one-way functions are excluded from the class of one-way functions. One could instead talk of one-way functions in the wide sense or in the strict sense.
类似地,准安全密码是一种即使是其设计者也能成功抵抗密码分析的密码,但仍存在计算有效的密码分析算法(这当然在计算上是不可能找到的)。同样,从实践的角度来看,安全密码和准安全密码本质上没有区别。
Similarly, a quasi secure cipher is a cipher which will successfully resist cryptanalysis, even by its designer, and yet for which there exists a computationally efficient cryptanalytic algorithm (which is of course computationally infeasible to find). Again, from a practical point of view, there is essentially no difference between a secure cipher and a quasi secure one.
我们已经看到,公钥密码系统意味着陷门单向函数的存在。但反之则不然。对于可用作公钥密码系统的陷门单向函数,它必须是可逆的(即具有唯一的逆)。
We have already seen that public key cryptosystems imply the existence of trap-door one-way functions. However the converse is not true. For a trap-door one-way function to be usable as a public key cryptosystem, it must be invertible (i.e., have a unique inverse).
密码学与所有其他领域的不同之处在于其要求似乎很容易得到满足。简单的转换会将清晰的文本转换为明显无意义的混乱。希望声称意义可能通过密码分析恢复的批评家,如果他想证明他的观点是正确的,那么他将面临艰巨的论证。然而,经验表明,很少有系统能够抵御熟练的密码分析者的协同攻击,并且许多所谓的安全系统随后都被攻破了。
Cryptography differs from all other fields of endeavor in the ease with which its requirements may appear to be satisfied. Simple transformations will convert a legible text into an apparently meaningless jumble. The critic, who wishes to claim that meaning might yet be recovered by cryptanalysis, is then faced with an arduous demonstration if he is to prove his point of view correct. Experience has shown, however, that few systems can resist the concerted attack of skillful cryptanalysts, and many supposedly secure systems have subsequently been broken.
因此,判断新系统的价值一直是密码学家关注的中心问题。在十六和十七世纪,数学论证经常被用来论证密码方法的强度,通常依赖于显示可能密钥的天文数字的计数方法。尽管这个问题很难通过如此简单的方法得到解决,但即使是著名的代数学家卡尔达诺也陷入了这个陷阱(Kahn,1967,第 145 页)。由于其强度被如此争论的系统一再被破坏,为系统安全性提供数学证明的概念声名狼藉,并被密码分析攻击的认证所取代。
In consequence of this, judging the worth of new systems has always been a central concern of cryptographers. During the sixteenth and seventeenth centuries, mathematical arguments were often invoked to argue the strength of cryptographic methods, usually relying on counting methods which showed the astronomical number of possible keys. Though the problem is far too difficult to be laid to rest by such simple methods, even the noted algebraist Cardano fell into this trap (Kahn, 1967, p. 145). As systems whose strength had been so argued were repeatedly broken, the notion of giving mathematical proofs for the security of systems fell into disrepute and was replaced by certification via cryptanalytic assault.
然而,在本世纪,钟摆开始向另一个方向摆回。Shannon(1949)在一篇与信息论的诞生密切相关的论文中表明,自二十年代末开始使用的一次性便笺本系统提供了“完美的保密性”(一种无条件安全的形式)。香农研究的可证明安全的系统依赖于使用长度随消息长度线性增长的密钥或完美的源编码,因此对于大多数用途来说都太笨重了。我们注意到,公钥密码系统和单向认证系统都不可能是无条件安全的,因为公共信息总是唯一地决定秘密信息。有限集的成员。由于计算量不受限制,因此可以通过简单的搜索来解决问题。
During this century, however, the pendulum has begun to swing back in the other direction. In a paper intimately connected with the birth of information theory, Shannon (1949) showed that the one time pad system, which had been in use since the late twenties, offered “perfect secrecy” (a form of unconditional security). The provably secure systems investigated by Shannon rely on the use of either a key whose length grows linearly with the length of the message or on perfect source coding and are therefore too unwieldy for most purposes. We note that neither public key cryptosystems nor one-way authentication systems can be unconditionally secure because the public information always determines the secret information uniquely among the members of a finite set. With unlimited computation, the problem could therefore be solved by a straightforward search.
在过去的十年里,两个密切相关的学科兴起,致力于研究计算成本:计算复杂性理论和算法分析。前者按难度将计算中的已知问题分为大类,而后者则专注于寻找更好的算法并研究它们消耗的资源。在简要介绍复杂性理论之后,我们将研究其在密码学中的应用,特别是单向函数的分析。
The past decade has seen the rise of two closely related disciplines devoted to the study of the costs of computation: computational complexity theory and the analysis of algorithms. The former has classified known problems in computing into broad classes by difficulty, while the latter has concentrated on finding better algorithms and studying the resources they consume. After a brief digression into complexity theory, we will examine its application to cryptography, particularly the analysis of one-way functions.
如果一个函数可以由确定性图灵机在其输入长度的某个多项式函数所限制的时间内计算出来,则该函数被认为属于复杂性类𝒫 (多项式)。人们可能会认为这是一类容易计算的函数,但更准确的说法是,不属于此类的函数至少对于某些输入来说必须难以计算。有些问题已知不属于𝒫类(Aho et al., 1974, pp. 405–425)。
A function is said to belong to the complexity class 𝒫 (for polynomial) if it can be computed by a deterministic Turing Machine in a time which is bounded above by some polynomial function of the length of its input. One might think of this as the class of easily computed functions, but it is more accurate to say that a function not in this class must be hard to compute for at least some inputs. There are problems which are known not to be in the class 𝒫 (Aho et al., 1974, pp. 405–425).
工程中出现的许多问题无法通过任何已知技术在多项式时间内解决,除非它们在具有无限并行度的计算机上运行。这些问题可能属于也可能不属于𝒫类,但属于𝒩 𝒫类(非确定性多项式),即在“非确定性”计算机(即具有无限并行度的计算机)上可在多项式时间内解决的问题。显然,类𝒩 𝒫包含类𝒫,复杂性理论中最大的开放问题之一是类𝒩 𝒫是否严格更大。
There are many problems which arise in engineering which cannot be solved in polynomial time by any known techniques, unless they are run on a computer with an unlimited degree of parallelism. These problems may or may not belong to the class 𝒫, but belong to the class 𝒩𝒫 (for nondeterministic, polynomial) of problems solvable in polynomial time on a “nondeterministic” computer (i.e., one with an unlimited degree of parallelism). Clearly the class 𝒩𝒫 includes the class 𝒫, and one of the great open questions in complexity theory is whether the class 𝒩𝒫 is strictly larger.
已知可在𝒩 𝒫时间内解决但未知可在𝒫时间内解决的问题包括旅行商问题、命题微积分的可满足性问题、背包问题、图着色问题以及许多调度和调度问题。最小化问题(Karp,1972;Aho 等,1974,第 363-404 页)。我们看到,并不是缺乏兴趣或努力阻碍了人们及时找到这些问题的解决方案。因此,我们坚信这些问题中至少有一个一定不属于类𝒫,因此类𝒩 𝒫严格来说更大。
Among the problems known to be solvable in 𝒩𝒫 time, but not known to be solvable in 𝒫 time, are versions of the traveling salesman problem, the satisfiability problem for propositional calculus, the knapsack problem, the graph coloring problem, and many scheduling and minimization problems (Karp, 1972; Aho et al., 1974, pp. 363–404). We see that it is not lack of interest or effort which has prevented people from finding solutions in 𝒫 time for these problems. It is thus strongly believed that at least one of these problems must not be in the class 𝒫, and that therefore the class 𝒩𝒫 is strictly larger.
Karp 确定了𝒩 𝒫问题的一个子类,称为𝒩 𝒫完整问题,其属性是,如果其中任何一个问题在𝒩中,则所有𝒩 𝒫问题都在𝒫中。Karp 列出了 21 个完整的问题,包括上面提到的所有问题(Karp,1972,这里第 36 章)。
Karp has identified a subclass of the 𝒩𝒫 problems, called 𝒩𝒫 complete, with the property that if any one of them is in 𝒫, then all 𝒩𝒫 problems are in 𝒫. Karp lists 21 problems which are 𝒩𝒫 complete, including all of the problems mentioned above (Karp, 1972, here chapter 36).
虽然𝒩 𝒫完整的问题显示出密码学应用的前景,但目前对其难度的理解仅包括最坏情况分析。出于加密目的,必须考虑典型的计算成本。然而,如果我们用平均或典型计算时间替换最坏情况计算时间作为我们的复杂性度量,则𝒩 𝒫完整问题之间的等价性的当前证明不再有效。这提出了几个有趣的研究主题。信息理论家熟悉的系综和典型性概念具有明显的作用。
While the 𝒩𝒫 complete problems show promise for cryptographic use, current understanding of their difficulty includes only worst case analysis. For cryptographic purposes, typical computational costs must be considered. If, however, we replace worst case computation time with average or typical computation time as our complexity measure, the current proofs of the equivalences among the 𝒩𝒫 complete problems are no longer valid. This suggests several interesting topics for research. The ensemble and typicality concepts familiar to information theorists have an obvious role to play.
我们现在可以确定一般密码分析问题在所有计算问题中的位置。能够在 𝒫 时间内完成加密和解密操作的系统的密码分析难度不能大于 𝒩 𝒫。
We can now identify the position of the general cryptanalytic problem among all computational problems. The cryptanalytic difficulty of a system whose encryption and decryption operations can be done in 𝒫 time cannot be greater than 𝒩𝒫.
要看到这一点,请观察任何密码分析问题都可以通过找到从有限集合中选择的密钥、逆图像等来解决。非确定性地选择密钥并在𝒫时间内验证它是否正确。如果有M 个可能的键可供选择,则必须采用M倍并行性。例如,在已知的明文攻击中,明文在每个密钥下同时被加密并与密码进行比较。由于假设加密只需要𝒫时间,因此密码分析只需要𝒩 𝒫时间。
To see this, observe that any cryptanalytic problem can be solved by finding a key, inverse image, etc., chosen from a finite set. Choose the key nondeterministically and verify in 𝒫 time that it is the correct one. If there are M possible keys to choose from, an M-fold parallelism must be employed. For example in a known plaintext attack, the plaintext is encrypted simultaneously under each of the keys and compared with the cryptogram. Since, by assumption, encryption takes only 𝒫 time, the cryptanalysis takes only 𝒩𝒫 time.
我们还观察到一般的密码分析问题是𝒩 𝒫完整的。这是根据我们对密码问题的定义的广度得出的。接下来将讨论具有𝒩 𝒫完全逆的单向函数。
We also observe that the general cryptanalytic problem is 𝒩𝒫 complete. This follows from the breadth of our definition of cryptographic problems. A one-way function with an 𝒩𝒫 complete inverse will be discussed next.
密码学可以通过检查𝒩 𝒫完整问题适应密码学使用的方式,直接从𝒩 𝒫复杂性理论中得出结论。特别是,有一个称为背包问题的𝒩 𝒫完全问题,它很容易构建单向函数。
Cryptography can draw directly from the theory of 𝒩𝒫 complexity by examining the way in which 𝒩𝒫 complete problems can be adapted to cryptographic use. In particular, there is an 𝒩𝒫 complete problem known as the knapsack problem which lends itself readily to the construction of a one-way function.
令y = f ( x ) = a · x其中a是n 个整数的已知向量( a 1 , a 2 , … , a n ) 并且x是二进制n向量。y的计算很简单,最多涉及n 个整数的和。求逆f的问题称为背包问题,需要找到 { a i } 的子集,其总和为y。
Let y = f(x) = a · x where a is a known vector of n integers $(a_1, a_2, \ldots, a_n)$ and x is a binary n-vector. Calculation of y is simple, involving a sum of at most n integers. The problem of inverting f is known as the knapsack problem and requires finding a subset of the $\{a_i\}$ which sum to y.
对所有 2 n个子集的穷举搜索呈指数增长,并且对于n大于 100 左右的情况在计算上是不可行的。然而,在选择问题参数时必须小心,以确保不可能走捷径。例如,如果n = 100并且每个a i为32位长,则y最多为39位长,并且f是高度简并的;平均只需要 2 38 次尝试找到解决方案。更简单的是,如果a i = 2 i −1则对f求逆相当于求y的二元分解。……
Exhaustive search of all $2^n$ subsets grows exponentially and is computationally infeasible for n greater than 100 or so. Care must be exercised, however, in selecting the parameters of the problem to ensure that shortcuts are not possible. For example, if n = 100 and each $a_i$ is 32 bits long, y is at most 39 bits long, and f is highly degenerate, requiring on the average only $2^{38}$ tries to find a solution. Somewhat more trivially, if $a_i = 2^{i-1}$ then inverting f is equivalent to finding the binary decomposition of y. …
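As a rough illustration of the knapsack function and its two pitfalls, here is a small sketch. The vector `a` and the helper names are made up for the example, and n is kept at 10 so the exhaustive search is practical.

```python
from itertools import product


def knapsack_f(a, x):
    """Forward direction: y = a . x for a binary vector x (at most n additions)."""
    return sum(ai for ai, xi in zip(a, x) if xi)


def invert_by_search(a, y):
    """Try all 2^n binary vectors; return the first whose sum is y, else None."""
    for x in product((0, 1), repeat=len(a)):
        if knapsack_f(a, x) == y:
            return x
    return None


def invert_powers_of_two(n, y):
    """Trivial case a_i = 2^(i-1): x is simply the binary expansion of y."""
    return tuple((y >> i) & 1 for i in range(n))


a = [14, 28, 56, 82, 90, 132, 197, 284, 341, 455]  # illustrative values only
x = (1, 0, 0, 1, 1, 0, 0, 0, 1, 0)
y = knapsack_f(a, x)
recovered = invert_by_search(a, y)  # some preimage of y, not necessarily x

# Degenerate parameters from the text: 100 terms of 32 bits each.
max_y = 100 * (2**32 - 1)           # largest possible sum; fits in 39 bits
```

The search costs up to 2^n evaluations of `knapsack_f`, while the forward direction costs n additions. The last line shows why n = 100 with 32-bit entries is a poor choice: 2^100 inputs are squeezed into fewer than 2^39 outputs, so preimages are plentiful.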
算法分析中感兴趣的另一个潜在的单向函数是求幂 mod q,它是由斯坦福大学的 John Gill 教授向作者建议的。该函数的单向性已在第42.3节中讨论过。
Another potential one-way function, of interest in the analysis of algorithms, is exponentiation mod q, which was suggested to the authors by Prof. John Gill of Stanford University. The one-wayness of this function has already been discussed in §42.3.
……我们在密码学历史上注意到的最后一个特征是业余密码学家和专业密码学家之间的区别。生产密码分析的技能一直主要由专业人士掌握,但创新,特别是新型密码系统的设计,主要来自业余爱好者。托马斯·杰斐逊 (Thomas Jefferson) 是一名密码学爱好者,他发明了一种在第二次世界大战期间仍在使用的系统(Kahn,1967 年,第 192-195 页),而 20 世纪最著名的密码系统——转子机,也是由四个独立的人,都是业余爱好者(Kahn,1967,第 415、420、422-424 页)。我们希望这将激励其他人在这个令人着迷的领域工作,该领域的参与在最近因几乎完全的政府垄断而受到阻碍。
… The last characteristic which we note in the history of cryptography is the division between amateur and professional cryptographers. Skill in production cryptanalysis has always been heavily on the side of the professionals, but innovation, particularly in the design of new types of cryptographic systems, has come primarily from the amateurs. Thomas Jefferson, a cryptographic amateur, invented a system which was still in use in World War II (Kahn, 1967, pp. 192–195), while the most noted cryptographic system of the twentieth century, the rotor machine, was invented simultaneously by four separate people, all amateurs (Kahn, 1967, pp. 415, 420, 422–424). We hope this will inspire others to work in this fascinating area in which participation has been discouraged in the recent past by a nearly total government monopoly.
经电气和电子工程师协会许可,转载自 Diffie 和 Hellman (1976a)。
Reprinted from Diffie and Hellman (1976a), with permission from the Institute of Electrical and Electronics Engineers.
如果我断言唐纳德·克努斯(Donald Knuth,生于 1938 年,发音为“Ka-NOOTH”)是有史以来最伟大的计算机科学家,我可能会遭到争论,就像有人可能会抗议威利·梅斯是一位比不朽者更伟大的棒球运动员一样贝比鲁斯。但没有人会说我把高德纳放在第一位是疯狂的。在其漫长而独特的职业生涯中,Knuth 为该领域做出了露丝般的贡献。
Were I to assert that Donald Knuth (b. 1938, pronounced “Ka-NOOTH”) is the greatest computer scientist of all time, I might get an argument, just as someone might protest that Willie Mays is a greater baseball player than the immortal Babe Ruth. But nobody would say I was crazy for putting Knuth first. Over his long and unique career, Knuth has made Ruthian contributions to the field.
1963 年,Knuth 在加州理工学院获得博士学位,随后开始在加州理工学院任教。他关于线性探测散列的笔记构成了算法的最早的数学分析之一(Knuth,1963),但从未发表。当时这种数学思考的市场是有限的。行动涉及编程系统和语言。1965 年一篇关于一类线性时间解析器(称为 LR 解析器)的论文(Knuth,1965)对该领域做出了重要贡献。(露丝一开始是一个相当不错的投手。)高德努斯开始写一本关于解析器和编译器的书。但是,由于对现有的奠定该领域基础的背景文本不满意,他同意撰写一本关于计算机科学的综述,名为《计算机编程的艺术》,简称TAOCP 。一开始有六章,然后是七章,但当第一章完成时,它已经增长到书本的长度——这在很大程度上是因为文献中描述的许多基本算法没有得到支持。直到 Knuth 提供了正确的数学分析。TAOCP第 1 卷于 1968 年出版,目前已是第三版(Knuth,1997a)。高德纳搬到了斯坦福大学(就像露丝搬到了纽约一样)。此后,第 2 卷和第 3 卷已出版并修订(Knuth,1998,1997b),第 4 卷本身已发展成为不止一本书,第 4A 卷已出版(Knuth,2011)。“分册”的发行是为了期待后续的卷和版本。
It started conventionally enough, with a PhD in 1963 from Cal Tech, where Knuth then started on the faculty. His notes on hashing with linear probing constitute one of the first mathematical analyses of an algorithm (Knuth, 1963), but were never published. The market for such mathematical musings was limited at the time; the action was in programming systems and languages. A 1965 paper on a class of linear-time parsers known as LR parsers (Knuth, 1965) was an important contribution to that field. (And Ruth started out as a pretty good pitcher.) Knuth began working on a book on parsers and compilers. But, dissatisfied with the available background texts laying out the foundations of the field, he agreed to write a one-volume survey of computer science called The Art of Computer Programming, or TAOCP for short. It was at first to be six chapters, then seven, but by the time the first chapter was finished, it had grown to book length—in no small part due to the fact that many of the basic algorithms described in the literature were not backed up by a proper mathematical analysis until Knuth provided it. Volume 1 of TAOCP appeared in 1968 and is now in its third edition (Knuth, 1997a). Knuth moved to Stanford (just as Ruth moved to New York). Since then, Volumes 2 and 3 have been published and revised (Knuth, 1998, 1997b), Volume 4 has itself grown to be more than one book, and Volume 4A has been published (Knuth, 2011). “Fascicles” are being issued in anticipation of later volumes and editions.
当高德纳 (Knuth) 撰写算法分析领域时,高德纳 (Knuth) 和他的学生正在创建算法分析领域。当他系统地学习计算机科学的数学和算法时,他无法回答的问题成为对社区的挑战,然后是论文主题,然后是整个子领域。他的博士生及其博士生的学术成果,以及他们对计算机科学学术部门的影响是惊人的。
Knuth and his students were creating the field of algorithm analysis as Knuth was writing it up. Questions he could not answer as he was systematically going through the mathematics and algorithms of computer science became challenges to the community, then thesis topics, and then entire subfields. The scholarly output of his PhD students and their PhD students, and their influence on academic departments of computer science, is staggering.
TAOCP推迟的另一个原因是,第一卷问世后,出版技术发生了变化,导致后来的版本和卷在外观上与早期的不统一。高德努斯没有接受这种审美上的冒犯,而是将注意力转向了这个问题排版,然后是字体设计和渲染。结果是 TEX(Knuth,1986b)和元字体(Knuth,1986a)用于排版本书和当今出版的许多数学文献的系统。
TAOCP was delayed also because after the appearance of the first volume, publishing technology changed, with the result that later editions and volumes would not be uniform in appearance with the earlier. Rather than accept this aesthetic offense, Knuth turned his attention to the problem of typesetting, and then of font design and rendering. The results were the TEX (Knuth, 1986b) and METAFONT (Knuth, 1986a) systems used to typeset this book and much of the mathematical literature being published today.
TAOCP不能轻易概括或提取。我没有为本书选择 Knuth 的一篇期刊论文,而是选择了这篇简短的注释,它最初以打字稿形式印刷在计算机协会理论界的时事通讯中(SIGACT 最初是指自动机和可计算性特别兴趣小组)理论)。与莱布尼茨(第 5 页)一样,高德纳对良好数学符号的传播产生了深远的影响——在这里是针对算法分析的新兴领域,并且更早地标准化了“ 𝒫 ”和“ 𝒩 𝒫 ”的使用(参见第 333 页)。尽管标题很顽皮(“omicron”和“omega”字面意思是“小 o”和“大 o”),这篇简短的注释显示了计算机科学领域可以保持的学术标准的深度,并描绘了计算机科学领域的学术标准。一幅生动的画面,描绘了这个人在工作中,多次前往斯坦福图书馆,斥责上个世纪伟大人物的鬼魂,因为他们的定义是错误的,最后以对话的语气暗示他想不出还有什么可说的,所以让我们都按照他的方式去做吧。这正是发生的事情;这些符号现在已成为世界标准。
TAOCP can’t easily be summarized or extracted. Rather than choosing one of Knuth’s journal papers for this volume, I have selected instead this short note, printed originally in typescript form in the newsletter of the theory community within the Association for Computing Machinery (SIGACT originally meant the Special Interest Group on Automata and Computability Theory). Like Leibniz (page 5), Knuth has profoundly influenced the propagation of good mathematical notation—here for the emerging field of algorithm analysis, and earlier in standardizing the usage of “𝒫” and “𝒩𝒫” (see page 333). In spite of its impish title (“omicron” and “omega” literally mean “little o” and “big o”), this short note shows the depth of the scholarly standard to which the field of computer science can be held, and paints a vivid picture of the man at work, making repeated trips to Stanford’s library, scolding the ghosts of great figures of the previous century for getting their definitions wrong, and finally suggesting in a conversational tone that he can’t think of anything else to say, so let’s all do it his way. Which is exactly what happened; these notations are now the world’s standards.
我们大多数人已经习惯了使用符号O ( f ( n )) 来代表任何函数,其大小上限为常数乘以f ( n ),对于所有大n。有时我们还需要下界函数的相应表示法,即对于所有大的n ,这些函数至少与常数乘以f ( n ) 一样大。不幸的是,人们偶尔会使用O表示法来表示下界,例如,当他们拒绝某种特定的排序方法“因为它的运行时间是O ( n 2 )”时。我经常在印刷品上看到这样的例子,最后它促使我坐下来给编辑写了一封关于这种情况的信。
Most of us have gotten accustomed to the idea of using the notation O(f(n)) to stand for any function whose magnitude is upper-bounded by a constant times f(n), for all large n. Sometimes we also need a corresponding notation for lower-bounded functions, i.e., those functions which are at least as large as a constant times f(n) for all large n. Unfortunately, people have occasionally been using the O-notation for lower bounds, for example when they reject a particular sorting method “because its running time is $O(n^2)$.” I have seen instances of this in print quite often, and finally it has prompted me to sit down and write a Letter to the Editor about the situation.
经典文献确实对有界的函数有一个符号,即Ω ( f ( n ))。这种表示法最突出的出现是在 Titchmarsh 关于黎曼 zeta 函数的巨著中(Titchmarsh,1951),其中他在 p 上定义了Ω ( f ( n ))。152,并用整个第 8 章来讨论“ Ω定理”。另见 Prachar(1957 年,第 245 页)。
The classical literature does have a notation for functions that are bounded below, namely Ω(f(n)). The most prominent appearance of this notation is in Titchmarsh’s magnum opus on Riemann’s zeta function (Titchmarsh, 1951), where he defines Ω(f(n)) on p. 152 and devotes his entire Chapter 8 to “Ω-theorems.” See also Prachar (1957, p. 245).
Ω表示法还没有变得很常见,尽管我注意到它在一些地方使用,最近在我查阅的一些关于等分布序列理论的俄罗斯出版物中。有一次,我在一封信中建议某人使用Ω表示法,“因为它已被数论学家使用多年”;但后来,当我被要求提供明确的参考文献时,我在图书馆里花了一个小时的时间进行搜索,结果却毫无结果,却没有找到任何参考文献。我最近询问了几位著名的数学家是否知道Ω ( n 2 ) 的含义,其中一半以上以前从未见过这个符号。
The Ω notation has not become very common, although I have noticed its use in a few places, most recently in some Russian publications I consulted about the theory of equidistributed sequences. Once I had suggested to someone in a letter that he use Ω-notation “since it had been used by number theorists for years”; but later, when challenged to show explicit references, I spent a surprisingly fruitless hour searching in the library without being able to turn up a single reference. I have recently asked several prominent mathematicians if they knew what $\Omega(n^2)$ meant, and more than half of them had never seen the notation before.
在写这封信之前,我决定更仔细地搜索,并研究一下O符号和O符号的历史。卡约里关于数学符号史的两卷本著作没有提到任何这些。在寻找Ω的定义时,我发现了本世纪初的数十本书,它们定义了O和o但没有定义Ω。我发现兰道的评论(兰道,1909,第883页),他所知道的O第一次出现是在巴赫曼(1894,第401页)。在同一地方,兰道说他在写关于素数分布的手册时亲自发明了o表示法;他对O和o的最初讨论是在 Landau(1909 年,第 59-62 页)中。
Before writing this letter, I decided to search more carefully, and to study the history of O-notation and o-notation as well. Cajori’s two-volume work on history of mathematical notations does not mention any of these. While looking for definitions of Ω I came across dozens of books from the early part of this century which defined O and o but not Ω. I found Landau’s remark (Landau, 1909, p. 883) that the first appearance of O known to him was in Bachmann (1894, p. 401). In the same place, Landau said that he had personally invented the o-notation while writing his handbook about the distribution of primes; his original discussion of O and o is in Landau (1909, pp. 59–62).
我在朗道的出版物中找不到任何Ω符号的出现;后来当我与乔治·波利亚讨论这个问题时,这一点得到了证实,乔治·波利亚告诉我,他是朗道的学生,并且非常熟悉他的著作。Pólya 知道Ω表示法的含义,但从未在自己的作品中使用过。(他说,有其师必有其徒。)
I could not find any appearances of Ω-notation in Landau’s publications; this was confirmed later when I discussed the question with George Pólya, who told me that he was a student of Landau’s and was quite familiar with his writings. Pólya knew what Ω-notation meant, but never had used it in his own work. (Like teacher, like pupil, he said.)
由于Ω符号很少使用,我前三次去图书馆几乎没有什么成果,但在第四次访问时,我终于能够查明它可能的起源:Hardy 和 Littlewood 在他们 1914 年的经典回忆录中介绍了 Ω(Hardy and Littlewood,1914 ),第 225 页),称其为“新”符号。他们还在关于素数分布的主要论文中使用了它(Hardy 和 Littlewood,1916,第 125 页),但他们显然发现在后来的作品中很少需要它。
Since Ω notation is so rarely used, my first three trips to the library bore little fruit, but on my fourth visit I was finally able to pinpoint its probable origin: Hardy and Littlewood introduced Ω in their classic 1914 memoir (Hardy and Littlewood, 1914, p. 225), calling it a “new” notation. They used it also in their major paper on distribution of primes (Hardy and Littlewood, 1916, pp. 125ff.), but they apparently found little subsequent need for it in later works.
不幸的是,Hardy 和 Littlewood 并没有按照我的意愿定义Ω ( f ( n ));它们的定义是o ( f ( n ))的否定,即当C是足够小的正常数时,对于无穷多个n ,其绝对值超过Cf ( n )的函数。对于我迄今为止在计算机科学中看到的所有应用程序,更强的要求(用“所有大n ”代替“无限多个n ”)更为合适。
Unfortunately, Hardy and Littlewood didn’t define Ω(f(n)) as I wanted them to; their definition was a negation of o(f(n)), namely a function whose absolute value exceeds Cf(n) for infinitely many n, when C is a sufficiently small positive constant. For all the applications I have seen so far in computer science, a stronger requirement (replacing “infinitely many n” by “all large n”) is much more appropriate.
在与人们讨论这个问题几年后,我得出的结论是,以下定义将被证明对计算机科学家最有用:
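After discussing this problem with people for several years, I have come to the conclusion that the following definitions will prove to be most useful for computer scientists:
O(f(n)) denotes the set of all g(n) such that there exist positive constants C and $n_0$ with |g(n)| ≤ Cf(n) for all n ≥ $n_0$.
Ω(f(n)) denotes the set of all g(n) such that there exist positive constants C and $n_0$ with g(n) ≥ Cf(n) for all n ≥ $n_0$.
Θ(f(n)) denotes the set of all g(n) such that there exist positive constants C, C′, and $n_0$ with Cf(n) ≤ g(n) ≤ C′f(n) for all n ≥ $n_0$.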
从语言上来说,O ( f ( n )) 可以理解为“阶数最多为f ( n )”;Ω ( f ( n )) 为“阶数至少为f ( n )”;θ ( f ( n )) 为“精确排序f ( n )”。当然,这些定义仅适用于n → ∞ 时的行为;当将f ( x )处理为x → 0 时,我们将用零邻域替换无穷大邻域,即 | x |≤ x 0而不是n ≥ n 0。
Verbally, O(f(n)) can be read as “order at most f(n)”; Ω(f(n)) as “order at least f(n)”; Θ(f(n)) as “order exactly f(n).” Of course, these definitions apply only to behavior as n → ∞; when dealing with f(x) as x → 0 we would substitute a neighborhood of zero for the neighborhood of infinity, i.e., |x| ≤ $x_0$ instead of n ≥ $n_0$.
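A small numeric sketch may help make "order exactly f(n)" concrete. It assumes the usual reading of Θ: g(n) is in Θ(f(n)) when constants C, C′, and n_0 exist with C·f(n) ≤ g(n) ≤ C′·f(n) for all n ≥ n_0. The example function and the witnesses are chosen for illustration, and a check over a finite range only supports a claim; it does not prove it.

```python
from math import log2


def holds_theta(g, f, c_low, c_high, n0, n_max):
    """Check c_low*f(n) <= g(n) <= c_high*f(n) for every n in [n0, n_max]."""
    return all(c_low * f(n) <= g(n) <= c_high * f(n)
               for n in range(n0, n_max + 1))


def g(n):
    return 3 * n * n + 5 * n   # example function


def f(n):
    return n * n               # candidate order of growth


# 3n^2 <= 3n^2 + 5n <= 4n^2 holds exactly when n >= 5, so n0 = 5 works.
ok = holds_theta(g, f, 3, 4, 5, 10_000)
too_early = holds_theta(g, f, 3, 4, 4, 10_000)   # fails at n = 4

# An upper bound alone says little: n*log2(n) also stays below n^2.
upper_ok = all(n * log2(n) <= f(n) for n in range(1, 10_001))
```

The last check illustrates the misuse the letter complains about: a method whose running time is O(n^2) may in fact be much faster, so an O-bound is no reason to reject it.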
虽然我改变了 Hardy 和 Littlewood 对Ω的定义,但我觉得这样做是合理的,因为他们的定义决没有广泛使用,而且因为在他们的定义适用的相对罕见的情况下,还有其他方法可以表达他们想说的内容。我喜欢Ω类比O的助记外观,而且很容易排版。此外,上面定义的这两个符号得到了θ符号的很好补充,这是 Bob Tarjan 和 Mike Paterson 独立向我建议的。
Although I have changed Hardy and Littlewood’s definition of Ω, I feel justified in doing so because their definition is by no means in wide use, and because there are other ways to say what they want to say in the comparatively rare cases when their definition applies. I like the mnemonic appearance of Ω by analogy with O, and it is easy to typeset. Furthermore, these two notations as defined above are nicely complemented by the Θ-notation, which was suggested to me independently by Bob Tarjan and by Mike Paterson.
上述定义指的是“所有g ( n )的集合,使得…… ”,而不是“具有……属性的任意函数g ( n ) ”;我相信,这个以集合为单位的定义是定义O表示法的最佳方式,这是 Ron Rivest 多年前向我建议的,作为对我的第 1 卷第一次印刷中的定义的改进。根据这种解释,当在公式中使用O表示法及其相关符号时,我们实际上是在谈论函数集而不是单个函数。当A和B是函数集合时,A + B表示集合{ a + b:a ∈ A且b ∈ B },以此类推;和“1 + O ( n −1 )”可以被认为是指形式为 1 + g ( n ) 的所有函数的集合,其中 | g ( n )|≤ Cn -1对于某些C和所有大n。这样就出现了单向等式的现象,即我们写 1 + O ( n -1 ) = O (1) 而不是O (1) = 1 + O ( n -1 )。这里的等号实际上意味着 ⊆(集合包含),这让许多人感到困扰,他们建议我们不允许在这种情况下使用 = 符号。我的感觉是,我们应该继续将单向等式与O符号一起使用,因为多年来它已成为数千名数学家的普遍做法,而且我们充分理解了现有符号的含义。
The definitions above refer to “the set of all g(n) such that …”, rather than to “an arbitrary function g(n) with the property that …”; I believe that this definition in terms of sets, which was suggested to me many years ago by Ron Rivest as an improvement over the definition in the first printing of my volume 1, is the best way to define O-notation. Under this interpretation, when the O-notation and its relatives are used in formulas, we are actually speaking about sets of functions rather than single functions. When A and B are sets of functions, A + B denotes the set {a + b: a ∈ A and b ∈ B}, etc.; and “$1 + O(n^{-1})$” can be taken to mean the set of all functions of the form 1 + g(n), where |g(n)| ≤ $Cn^{-1}$ for some C and all large n. The phenomenon of one-way equalities arises in this connection, i.e., we write $1 + O(n^{-1}) = O(1)$ but not $O(1) = 1 + O(n^{-1})$. The equal sign here really means ⊆ (set inclusion), and this has bothered many people who propose that we not be allowed to use the = sign in this context. My feeling is that we should continue to use one-way equality together with O-notations, since it has been common practice of thousands of mathematicians for so many years now, and since we understand the meaning of our existing notation sufficiently well.
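The one-way equality can be checked directly from the set reading. The following worked step is a sketch under that reading; the names h, g, C, and n_0 are introduced here only for the argument.

```latex
% Forward inclusion: 1 + O(n^{-1}) \subseteq O(1)
\begin{align*}
h(n) = 1 + g(n),\quad |g(n)| \le C n^{-1} \ \text{for } n \ge n_0
\;\Longrightarrow\;
|h(n)| \le 1 + |g(n)| \le 1 + C \ \text{for } n \ge \max(n_0, 1).
\end{align*}
% The reverse inclusion fails: the constant function h(n) = 2 lies in O(1),
% but h(n) - 1 = 1 is not bounded by C n^{-1} for all large n, whatever C is.
```

So every member of the left-hand set belongs to the right-hand set, but not the other way around, which is why the equation may be read only from left to right.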
我们还可以将ω ( f ( n )) 定义为与f ( n )之比无界的所有函数的集合,类比于o ( f ( n ))。就我个人而言,我觉得没有必要使用这些o符号;相反,我发现始终获得O估计是一门很好的学科,因为它教会了我更强大的数学方法。然而,我预计有一天,当面对一个我无法证明更强大的函数时,我可能不得不崩溃并使用o表示法。
We could also define ω(f(n)) as the set of all functions whose ratio to f(n) is unbounded, by analogy to o(f(n)). Personally I have felt little need for these o-notations; on the contrary, I have found it a good discipline to obtain O-estimates at all times, since it has taught me about more powerful mathematical methods. However, I expect someday I may have to break down and use o-notation when faced with a function for which I can’t prove anything stronger.
请注意,上述O、Ω和θ的定义稍微缺乏对称性,因为仅在O的情况下才对g ( n )使用绝对值符号。这并不是真正的异常,因为O指的是零邻域,而Ω指的是无穷大邻域。(哈代关于发散级数的书在需要单向O结果时使用O L和O R。Hardy和 Littlewood分别无限频繁地使用Ω L和Ω R来表示函数< − Cf ( n ) 和> Cf ( n ) 。这些都没有广泛传播。)
Note that there is a slight lack of symmetry in the above definitions of O, Ω, and Θ, since absolute value signs are used on g(n) only in the case of O. This is not really an anomaly, since O refers to a neighborhood of zero while Ω refers to a neighborhood of infinity. (Hardy’s book on divergent series uses $O_L$ and $O_R$ when a one-sided O-result is needed. Hardy and Littlewood used $\Omega_L$ and $\Omega_R$ for functions respectively < −Cf(n) and > Cf(n) infinitely often. Neither of these has become widespread.)
上述符号旨在用于绝大多数应用,但它们并不旨在满足所有可以想象的需求。例如,如果您正在处理像 (log log n ) cos n这样的函数,您可能需要一个表示法来表示“在 log log n和 1/log log n之间振荡的所有函数,其中这些限制是最好的。” 在这种情况下,用于此目的的本地符号(仅限于您当时正在撰写的任何论文的页面)就足够了;不必担心概念的标准符号,除非该概念经常出现。
The above notations are intended to be useful in the vast majority of applications, but they are not intended to meet all conceivable needs. For example, if you are dealing with a function like $(\log\log n)^{\cos n}$ you might want a notation for “all functions which oscillate between log log n and 1/log log n where these limits are best possible.” In such a case, a local notation for the purpose, confined to the pages of whatever paper you are writing at the time, should suffice; it isn’t necessary to worry about standard notations for a concept unless that concept arises frequently.
我想通过讨论一种表示函数增长顺序的竞争方式来结束这封信。我的图书馆研究发现了一个令人惊讶的事实,即这种替代方法实际上早于O表示法本身。Paul du Bois-Reymond (1870) 使用关系符号
I would like to close this letter by discussing a competing way to denote the order of function growth. My library research turned up the surprising fact that this alternative approach actually antedates the O-notation itself. Paul du Bois-Reymond (1870) used the relational notations g(n) ≺ f(n) and f(n) ≻ g(n)
早在 1871 年,对于正函数f ( n ) 和g ( n ),我们现在可以将其含义描述为g ( n ) = o ( f ( n )) (或f ( n ) = ω ( g ( n ))))。哈代关于“无限秩序”的有趣小册子(哈代,1924)通过使用关系来扩展这一点
already in 1871, for positive functions f(n) and g(n), with the meaning we can now describe as g(n) = o(f(n)) (or as f(n) = ω(g(n))). Hardy’s interesting tract on “Orders of Infinity” (Hardy, 1924) extends this by using also the relations g(n) ≼ f(n) and f(n) ≽ g(n)
表示g ( n ) = O ( f ( n )) (或者等效地,f ( n ) = Ω ( g ( n )),因为我们假设f和g为正)。哈迪还写道
to mean g(n) = O(f(n)) (or, equivalently, f(n) = Ω(g(n)), since we are assuming that f and g are positive). Hardy also wrote f(n) ≍ g(n)
当g ( n ) = θ ( f ( n )) 时,并且
when g(n) = Θ(f(n)), and
当存在且既不是 0 也不是 ∞ 时;他写道
when $\lim_{n\to\infty} f(n)/g(n)$ exists and is neither 0 nor ∞; and he wrote f(n) ∼ g(n)
什么时候。(哈代的
符号一开始可能看起来很奇怪,直到你意识到他用它做了什么;例如,他证明了以下很好的定理:“如果f ( n ) 和g ( n ) 是从普通算术运算递归建立的任何函数以及 exp 和 log 函数,我们正好具有以下三种关系f ( n ) ≺ g ( n )、f ( n )
g ( n ) 或f ( n ) ≻ g ( n ) 之一。”)
when $\lim_{n\to\infty} f(n)/g(n) = 1$. (Hardy’s notation may seem peculiar at first, until you realize what he did with it; for example, he proved the following nice theorem: “If f(n) and g(n) are any functions built up recursively from the ordinary arithmetic operations and the exp and log functions, we have exactly one of the three relations f(n) ≺ g(n), f(n) g(n), or f(n) ≻ g(n).”)
多年来,哈代出色的记谱法已经变得有些扭曲。例如,Vinogradov (1954) 将f ( n ) ≼ g ( n ) 写为f ( n ) ≪ g ( n );因此,维诺格拉多夫对这个公式很满意
Hardy’s excellent notation has become somewhat distorted over the years. For example, Vinogradov (1954) writes f(n) ≪ g(n) instead of f(n) ≼ g(n); thus, Vinogradov is comfortable with the formula
而我不是。无论如何,这种关系符号具有直观清晰的传递属性,并且它们避免了使用单向等式,这让一些人感到困扰。那么,为什么它们不应该取代O以及新符号Ω和θ呢?
while I am not. In any event such relational notations have intuitively clear transitive properties, and they avoid the use of one-way equalities which bother some people. Why, then, should they not replace O and the new symbols Ω and Θ?
O如此方便的主要原因是我们可以在公式中间(以及英语句子中间,以及显示一系列相关算法的运行时间的表格等)中使用它。关系符号要求我们将除我们估计的函数之外的所有内容转置到方程的一侧。(参见 Prachar [1957,第 191 页]。)简单的推导,如
The main reason why O is so handy is that we can use it right in the middle of formulas (and in the middle of English sentences, and in tables which show the running times for a family of related algorithms, etc.). The relational notations require us to transpose everything but the function we are estimating to one side of an equation. (Cf. Prachar [1957, p. 191].) Simple derivations like
用关系表示法会非常麻烦。
would be extremely cumbersome in relational notation.
当我解决问题时,我的草稿纸笔记经常包含临时符号,并且我一直使用像“(≤5 n 2 )”这样的表达式来代表所有函数的集合,这些函数是≤ 5 n 2。同样,我可以写“( ∼ 5 n 2 )”来代表渐近于 5 n 2等的函数;因此,如果我将≼关系与可能为负的函数进行适当扩展, “( ≼ n 2 )”将等价于O ( n 2 )。这将为各种事物提供统一的符号约定,用于表达式中间,给出的不仅仅是上面提出的O、Ω和θ 。
When I am working on a problem, my scratch paper notes often contain ad hoc notations, and I have been using an expression like “$(\le 5n^2)$” to stand for the set of all functions which are ≤ $5n^2$. Similarly, I can write “$(\sim 5n^2)$” to stand for functions which are asymptotic to $5n^2$, etc.; and “$(\preccurlyeq n^2)$” would therefore be equivalent to $O(n^2)$, if I made appropriate extensions of the ≼ relation to functions which may be negative. This would provide a uniform notational convention for all sorts of things, for use in the middle of expressions, giving more than just the O and Ω and Θ proposed above.
尽管如此,我还是更喜欢用O、Ω和θ符号发表论文;仅当遇到需要的情况时,我才会使用其他符号,例如“( ∼ 5 n 2 )”。为什么?主要原因是O表示法如此普遍地建立和接受,我觉得用我自己发明的表示法“( ≼ f ( n ))”代替它是不正确的,无论逻辑上如何构思;O表示法现在已经具有重要的助记意义,我们对此感到满意。出于类似的原因,我并没有放弃十进制表示法,尽管我发现八进制(比如说)更符合逻辑。我喜欢Ω和θ符号,因为它们现在具有从O继承的助记意义。
In spite of this, I much prefer to publish papers with the O, Ω, and Θ notations; I would use other notations like “$(\sim 5n^2)$” only when faced with a situation that needed it. Why? The main reason is that O-notation is so universally established and accepted, I would not feel right replacing it by a notation “(≼f(n))” of my own invention, however logically conceived; the O-notation has now assumed important mnemonic significance, and we are comfortable with it. For similar reasons, I am not abandoning decimal notation although I find that octal (say) is more logical. And I like the Ω and Θ notations because they now have mnemonic significance inherited from O.
好吧,我想我已经把这个问题解决了,因为不知道有其他论据支持或反对引入Ω和θ。根据此处讨论的问题,我建议 SIGACT 成员以及计算机科学和数学期刊的编辑采用上面定义的O、Ω和θ符号,除非可以很快找到更好的替代方案。此外,我建议在关系符号更合适的情况下采用 Hardy 的关系符号。
Well, I think I have beat this issue to death, knowing of no other arguments pro or con the introduction of Ω and Θ. On the basis of the issues discussed here, I propose that members of SIGACT, and editors of computer science and mathematics journals, adopt the O, Ω, and Θ notations as defined above, unless a better alternative can be found reasonably soon. Furthermore I propose that the relational notations of Hardy be adopted in those situations where a relational notation is more appropriate.
经 Donald E. Knuth 许可,转载自 Knuth (1976)。
Reprinted from Knuth (1976), with permission from Donald E. Knuth.
一位尊贵的同事认为,这篇论文表现出如此具有争议性的过度行为,因此它不应该出现在这个文集中。当它最初发表在会议论文集上时,引起了很大的争议,导致项目验证资金的流动显着放缓。尽管验证技术如今已广泛用于硬件设计,但大型软件系统的形式验证仍然很少见,原因甚至连 Hoare 似乎也承认了(参见第 298 页)。
A valued colleague believes this paper displays such polemical overreach that it should not appear in this collection. When published, originally in a conference proceedings, it caused such controversy that the flow of program verification funding slowed significantly. Though verification techniques are widely used today for hardware designs, formal verification of large software systems is still a rarity, for reasons even Hoare seems to have acknowledged (see page 298).
这篇论文触动了人们的神经,因为它要求反思计算机科学与数学的相似和不同之处,甚至反思整个数学的形式主义程序以及数学在实践中的工作方式。事实上,当提到这篇论文时,一些计算机科学家仍然感到愤怒,这证明了它的辩证力量,也证明了这个领域已经足够成熟,足以因深深的怨恨而分裂。争论持续了很长一段时间。十年后,詹姆斯·费泽(James Fetzer,1988)宣称德米洛、利普顿和玻璃市“为一些需要进一步阐述并值得更好支持的立场提供了一些糟糕的论据”,从而冒犯了双方。今天毫无疑问,即使霍尔的目标过于雄心勃勃,硬件设计和低级代码的形式验证不仅有用而且必不可少。2007 年图灵奖颁给了 Edmund Clarke、Allen Emerson 和 Joseph Sifakis,他们利用形式化模型检查作为硬件和软件验证的工具。
The paper hit a nerve because it demanded introspection about the ways in which computer science resembles, and differs from, mathematics—and indeed about the entire formalist program for mathematics and the way mathematics works in practice. The fact that some computer scientists still bristle when this paper is mentioned is testament to its dialectical force—and to the fact that the field had matured enough to be divided by deep resentments. The dispute continued for quite a while; a decade later, James Fetzer (1988) offended both sides by declaring that DeMillo, Lipton, and Perlis had offered “some bad arguments for some positions that need further elaboration and deserve better support.” There is little doubt today that, even if Hoare’s goals were excessively ambitious, the formal verification of hardware design and low-level code is not only useful but essential. The 2007 Turing Award went to Edmund Clarke, Allen Emerson, and Joseph Sifakis for harnessing formal model-checking as a tool for the verification of both hardware and software.
Alan Perlis(1922-1990)是一位编程语言先驱。1957 年,他担任美国-欧洲联合委员会主席,设计了一种算法语言,该语言最终成为 A LGOL 60,它是所有块结构命令式算法语言的先驱。1966 年,他是第一位图灵奖获得者。玻璃市还因其在建立计算机科学作为一门学科方面所发挥的作用而被人们铭记。
Alan Perlis (1922–1990) was a programming language pioneer. In 1957 he chaired the joint U.S.–European committee to design an algorithmic language, which ultimately became ALGOL 60, the precursor of all block-structured, imperative algorithmic languages. He was the first recipient of the Turing Award, in 1966. Perlis is also remembered for his role in establishing computer science as an academic discipline.
玻璃市在现在的卡内基梅隆大学创建了计算机科学系;它成为全国领先的部门之一。1971 年,他加入了耶鲁大学新成立的计算机科学系,并作为系主任领导了该系在 20 世纪 70 年代末的发展。他在这两所机构中率先将计算机科学定位为一门独立的学科,为其他学院和大学提供了具有广泛影响力的模式。
Perlis founded the computer science department at what is now Carnegie Mellon University; it became one of the nation’s leading departments. In 1971 he joined the new computer science department at Yale and as chair led its growth in the late 1970s. The lead he set in positioning computer science as an intellectually independent discipline in these two institutions provided widely influential models for other colleges and universities.
理查德·德米洛(Richard DeMillo,生于 1947 年)和理查德·利普顿(Richard Lipton,生于 1948 年)是长期的科学合作者。在撰写本文时,利普顿在玻璃市的耶鲁大学。德米洛和利普顿现在都在佐治亚理工学院工作,德米洛领导着一个研究高等教育未来的中心。
Richard DeMillo (b. 1947) and Richard Lipton (b. 1948) are longtime scientific collaborators. At the time the paper was written, Lipton was at Yale, with Perlis; DeMillo and Lipton are both now at Georgia Tech, where DeMillo is leading a center on the future of higher education.
我想问笛卡尔提出的同样的问题。您提议对逻辑正确性给出一个精确的定义,这与我对逻辑正确性的模糊直觉感觉是一样的。你打算如何证明它们是相同的?……普通数学家不应该忘记直觉是最终的权威。
——J。巴克利·罗瑟
I should like to ask the same question that Descartes asked. You are proposing to give a precise definition of logical correctness which is to be the same as my vague intuitive feeling for logical correctness. How do you intend to show that they are the same? … The average mathematician should not forget that intuition is the final authority.
—J. Barkley Rosser
许多人认为计算机编程应该努力变得更像数学。也许是这样,但不是他们想象的那样。程序验证的目的是使编程更加数学化,其目的是显着提高人们对软件正确运行的信心,而验证者用来实现这一目标的手段是一长串形式化、演绎式的过程。逻辑。在数学中,目标是增加人们对定理正确性的信心,理论上数学家可以用来实现这一目标的手段之一确实是一长串形式逻辑。但事实上他们没有。他们使用的是一种非常不同的动物作为证据。证据也不能解决问题;与它的名字所暗示的相反,证明只是迈向自信的一步。我们相信,归根结底,是一个社会过程决定了数学家是否对某个定理充满信心——而且我们相信,由于程序验证者之间无法发生类似的社会过程,程序验证必然会失败。我们看不出它将如何影响任何人对项目的信心。
Many people have argued that computer programming should strive to become more like mathematics. Maybe so, but not in the way they seem to think. The aim of program verification, an attempt to make programming more mathematics-like, is to increase dramatically one’s confidence in the correct functioning of a piece of software, and the device that verifiers use to achieve this goal is a long chain of formal, deductive logic. In mathematics, the aim is to increase one’s confidence in the correctness of a theorem, and it’s true that one of the devices mathematicians could in theory use to achieve this goal is a long chain of formal logic. But in fact they don’t. What they use is a proof, a very different animal. Nor does the proof settle the matter; contrary to what its name suggests, a proof is only one step in the direction of confidence. We believe that, in the end, it is a social process that determines whether mathematicians feel confident about a theorem, and we believe that, because no comparable social process can take place among program verifiers, program verification is bound to fail. We can’t see how it’s going to be able to affect anyone’s confidence about programs.
局外人认为数学是一个冷酷的、形式化的、逻辑性的、机械的、纯粹智力的整体过程。我们认为,只要数学是成功的,它就是一个社会的、非正式的、直观的、有机的、人类的过程,一个社区项目。在数学界,伯特兰·罗素和大卫·希尔伯特在本世纪初阐述了数学作为逻辑和形式的观点。他们认为数学原则上是从公理或假设逐步发展到定理,每一步都可以通过严格的变换规则轻松地从其前身中证明,变换规则很少且固定。《数学原理》是形式主义者的最高成就。这也是对形式主义观点的致命一击。这里并不矛盾:罗素确实成功地证明了普通的工作证明可以简化为正式的、象征性的推论。但他在三本巨著、繁重的著作中未能超越算术的基本事实。他展示了原则上可以做的事情和实践中不能做的事情。如果数学过程确实是严格的逻辑级数之一,我们仍然会用手指数数。
Outsiders see mathematics as a cold, formal, logical, mechanical, monolithic process of sheer intellection; we argue that insofar as it is successful, mathematics is a social, informal, intuitive, organic, human process, a community project. Within the mathematical community, the view of mathematics as logical and formal was elaborated by Bertrand Russell and David Hilbert in the first years of this century. They saw mathematics as proceeding in principle from axioms or hypotheses to theorems by steps, each step easily justifiable from its predecessors by a strict rule of transformation, the rules of transformation being few and fixed. The Principia Mathematica was the crowning achievement of the formalists. It was also the deathblow for the formalist view. There is no contradiction here: Russell did succeed in showing that ordinary working proofs can be reduced to formal, symbolic deductions. But he failed, in three enormous, taxing volumes, to get beyond the elementary facts of arithmetic. He showed what can be done in principle and what cannot be done in practice. If the mathematical process were really one of strict, logical progression, we would still be counting on our fingers.
事实上,每一位数学家都知道,如果一个人除了一步一步地验证该证明所构成的推论的正确性之外什么也没做,并且没有试图清楚地了解导致该证明的思想,那么该证明还没有被“理解”。这一特定扣除链的构建优先于其他所有扣除链。
——N。布尔巴基
Indeed every mathematician knows that a proof has not been “understood” if one has done nothing more than verify step by step the correctness of the deductions of which it is composed and has not tried to gain a clear insight into the ideas which have led to the construction of this particular chain of deductions in preference to every other one.
—N. Bourbaki
如果我看起来说的是实话,请同意我的观点。——苏格拉底
Agree with me if I seem to speak the truth. —Socrates
Stanislaw Ulam 估计数学家每年发布 200,000 个定理。其中一些随后遭到矛盾或以其他方式不允许,另一些则受到质疑,而大多数则被忽视。只有一小部分能够被任何相当大的数学家群体所理解和相信。
Stanislaw Ulam estimates that mathematicians publish 200,000 theorems every year. A number of these are subsequently contradicted or otherwise disallowed, others are thrown into doubt, and most are ignored. Only a tiny fraction come to be understood and believed by any sizable group of mathematicians.
被忽视或质疑的定理很少是疯子或无能者的成果。Kempe (1879) 发表了四色猜想的证明,该证明在 Heawood (1890) 发现推理中的致命缺陷之前持续了十一年。哈代和利特尔伍德之间的首次合作发表了一篇论文,发表于 1911 年 6 月伦敦数学会会议上;这篇论文从未发表,因为他们随后发现他们的证明是错误的(Bateman 和 Diamond,1978)。柯西、拉姆尔和库默都曾一度认为他们已经证明了费马大定理(Davis,1972)。1945年,拉德马赫认为他已经解决了黎曼猜想;他的结果不仅在数学界流传,还发表在《时代》杂志上(戴维斯,1972)。
The theorems that get ignored or discredited are seldom the work of crackpots or incompetents. Kempe (1879) published a proof of the four-color conjecture that stood for eleven years before Heawood (1890) uncovered a fatal flaw in the reasoning. The first collaboration between Hardy and Littlewood resulted in a paper they delivered at the June 1911 meeting of the London Mathematical Society; the paper was never published because they subsequently discovered that their proof was wrong (Bateman and Diamond, 1978). Cauchy, Lamé, and Kummer all thought at one time or another that they had proved Fermat’s Last Theorem (Davis, 1972). In 1945, Rademacher thought he had solved the Riemann Hypothesis; his results not only circulated in the mathematical world but were announced in Time magazine (Davis, 1972).
最近,我们发现了以下一组脚注,附加在集合论中一些独立性结果的简要历史概述中(Jech,1973):
Recently we found the following group of footnotes appended to a brief historical sketch of some independence results in set theory (Jech, 1973):
1. 问题 11 的结果与 Levy 宣布的结果相矛盾。遗憾的是,那里给出的构造无法完成。
1. The result of Problem 11 contradicts the results announced by Levy. Unfortunately, the construction presented there cannot be completed.
2. Marek 也声称结果可以转移到 ZF,但所概述的方法似乎并不令人满意,也尚未发表。
2. The transfer to ZF was also claimed by Marek but the outlined method appears to be unsatisfactory and has not been published.
3. Truss 宣布了一个矛盾的结果,但随后又撤回了这一结果。
3. A contradicting result was announced and later withdrawn by Truss.
4.问题22中的例子是Mostowski另一个条件的反例,Mostowski猜想了它的充分性,并挑出这个例子作为测试用例。
4. The example in Problem 22 is a counterexample to another condition of Mostowski, who conjectured its sufficiency and singled out this example as a test case.
5. 该独立性结果与 Felgner 关于共尾性原理蕴含选择公理的主张相矛盾。Morris 发现了其中的一个错误(参见 Felgner 对 [1969] 的更正)。
5. The independence result contradicts the claim of Felgner that the Cofinality Principle implies the Axiom of Choice. An error has been found by Morris (see Felgner’s corrections to [1969]).
作者并无私心;他可能从未听说过当前编程界的争议;显然,他也无意让他的朋友和同事受人嘲笑。如果不描述证明中相继发生的社会过程,就根本无法描述数学思想的历史。关键不在于数学家会犯错误;那是不言而喻的。关键在于,纠正数学家错误的不是形式符号逻辑,而是其他数学家。
The author has no axe to grind; he has probably never even heard of the current controversy in programming; and it is clearly no part of his concern to hold his friends and colleagues up to scorn. There is simply no way to describe the history of mathematical ideas without describing the successive social processes at work in proofs. The point is not that mathematicians make mistakes; that goes without saying. The point is that mathematicians’ errors are corrected, not by formal symbolic logic, but by other mathematicians.
仅仅增加研究特定问题的数学家数量并不一定能确保证明可信。最近,两个独立的拓扑学家小组,一个是美国人,另一个是日本人,独立地宣布了关于同种拓扑对象(称为同伦群)的结果。结果证明是矛盾的,而且由于这两个证明都涉及复杂的符号和数值计算,所以根本不清楚谁犯了错误。但赌注足够高,有理由紧迫解决这个问题,因此日本和美国交换了证据。显然,每个小组都非常积极地发现对方证明中的错误。显然,其中一个证明是不正确的。但日本和美国的证据都不容置疑。随后,第三组研究人员获得了另一个证据,这次支持了美国的结果。现在证据的分量与他们的证据相悖,日本人已经退休以进一步考虑这个问题。
Just increasing the number of mathematicians working on a given problem does not necessarily insure believable proofs. Recently, two independent groups of topologists, one American, the other Japanese, independently announced results concerning the same kind of topological object, a thing called a homotopy group. The results turned out to be contradictory, and since both proofs involved complex symbolic and numerical calculation, it was not at all evident who had goofed. But the stakes were sufficiently high to justify pressing the issue, so the Japanese and American proofs were exchanged. Obviously, each group was highly motivated to discover an error in the other’s proof; obviously, one proof or the other was incorrect. But neither the Japanese nor the American proof could be discredited. Subsequently, a third group of researchers obtained yet another proof, this time supporting the American result. The weight of the evidence now being against their proof, the Japanese have retired to consider the matter further.
这个故事实际上有两个寓意。首先,证明本身并不能显着提高我们对其所要证明的定理的可能正确性的信心。事实上,对于同伦群的定理,所有提供的证明的可怕性表明该定理本身需要重新思考。第二点是,完全由计算组成的证明不一定是正确的。
There are actually two morals to this story. First, a proof does not in itself significantly raise our confidence in the probable truth of the theorem it purports to prove. Indeed, for the theorem about the homotopy group, the horribleness of all the proffered proofs suggests that the theorem itself requires rethinking. A second point to be made is that proofs consisting entirely of calculations are not necessarily correct.
即使简单、清晰和容易也不能保证证明是正确的。试图证明平行公设的历史是一个特别丰富的来源,其中有很多可爱而简洁的证明,但结果证明是错误的。从托勒密到勒让德(他一次又一次地尝试),各个时代最伟大的几何学家都不断地用头来反对欧几里得的第五条公设。更糟糕的是,尽管我们现在知道该公设是不可证明的,但许多错误的证明仍然如此令人迷惑,以至于在希斯对欧几里得的权威评论(Euclid,1956)中,它们不允许单独存在;希思用斜体、脚注和解释性旁注对它们进行标记,以免一些年轻的数学家在翻阅这本书时被误导。
Even simplicity, clarity, and ease provide no guarantee that a proof is correct. The history of attempts to prove the Parallel Postulate is a particularly rich source of lovely, trim proofs that turned out to be false. From Ptolemy to Legendre (who tried time and time again), the greatest geometricians of every age kept ramming their heads against Euclid’s fifth postulate. What’s worse, even though we now know that the postulate is indemonstrable, many of the faulty proofs are still so beguiling that in Heath’s definitive commentary on Euclid (Euclid, 1956) they are not allowed to stand alone; Heath marks them up with italics, footnotes, and explanatory marginalia, lest some young mathematician, thumbing through the volume, be misled.
证明充其量只能表达真理,这一观点与最近的一场数学争论有着有趣的联系。在最近一期《科学》杂志上,Kolata(1976)提出,数学证明的表面上安全的概念可能需要修订。这里的中心问题不是“定理如何被相信?” 但是“当我们相信一个定理时,我们到底相信什么?” 有两种相关的观点,可以粗略地称为经典观点和概率观点。
The idea that a proof can, at best, only probably express truth makes an interesting connection with a recent mathematical controversy. In a recent issue of Science, Kolata (1976) suggested that the apparently secure notion of mathematical proof may be due for revision. Here the central question is not “How do theorems get believed?” but “What is it that we believe when we believe a theorem?” There are two relevant views, which can be roughly labeled classical and probabilistic.
古典主义者说,当一个人相信数学陈述A时,就相信原则上存在一种正确的、形式化的、有效的、逐步的、语法上可检查的演绎,在合适的逻辑演算(例如策梅洛-弗兰克尔集合论或皮亚诺算术)中导致A , A à la the Principia的演绎,这种演绎将A的真理完全形式化为二元的、亚里士多德的真理概念:“如果一个命题说的是什么,那么它就是,并且如果它说的是什么,那么它就是真的。不是,事实不是。” [编辑:“说存在的东西不存在,或者说不存在的东西存在,都是错误的;但说存在的东西是,不存在的东西不是,这是真的”(亚里士多德,1933,IV.7)。]这种形式的推理链绝不是与日常的普通数学证明相同的东西。经典观点不要求普通证明附有其形式对应物;相反,有数学上合理的理由允许众神将我们的大部分论点形式化。例如,一位理论家估计,假设集合论和初等分析的拉马努金猜想之一的正式论证需要大约两千页;从第一原理进行演绎的长度几乎是不可想象的(Manin,1977)。但古典主义者认为,形式化原则上是一种可能性,它所表达的真理是二元的,要么是这样,要么不是。
The classicists say that when one believes mathematical statement A, one believes that in principle there is a correct, formal, valid, step by step, syntactically checkable deduction leading to A in a suitable logical calculus such as Zermelo-Fraenkel set theory or Peano arithmetic, a deduction of A à la the Principia, a deduction that completely formalizes the truth of A in the binary, Aristotelian notion of truth: “A proposition is true if it says of what is, that it is, and if it says of what is not, that it is not.” [EDITOR: “To say that what is is not, or that what is not is, is false; but to say that what is is, and what is not is not, is true” (Aristotle, 1933, IV.7).] This formal chain of reasoning is by no means the same thing as an everyday, ordinary mathematical proof. The classical view does not require that an ordinary proof be accompanied by its formal counterpart; on the contrary, there are mathematically sound reasons for allowing the gods to formalize most of our arguments. One theoretician estimates, for instance, that a formal demonstration of one of Ramanujan’s conjectures assuming set theory and elementary analysis would take about two thousand pages; the length of a deduction from first principles is nearly inconceivable (Manin, 1977). But the classicist believes that the formalization is in principle a possibility and that the truth it expresses is binary, either so or not so.
概率主义者认为,既然任何很长的证明最多只能被视为可能正确,那么为什么不概率性地陈述定理并给出概率性证明呢?概率证明可能具有双重优势,即在技术上比经典的二价证明更容易,并且可能允许数学家分离出在传统的二元证明中引起不确定性的关键思想。这个过程甚至可能会带来更合理的经典证明。概率论方法的一个例子是 Michael Rabin 的测试可能素性的算法(Rabin,1976)。对于非常大的整数N,所有用于确定N是否为合数的经典技术都变得不起作用。即使使用最巧妙的编程,确定大于 $10^{10^4}$ 的数字是否为素数所需的计算也需要大量的计算时间。拉宾的见解是,如果您愿意满足于N是质数(或非质数)的极高概率,那么您就可以在合理的时间内获得它,并且错误概率极小。
The probabilists argue that since any very long proof can at best be viewed as only probably correct, why not state theorems probabilistically and give probabilistic proofs? The probabilistic proof may have the dual advantage of being technically easier than the classical, bivalent one, and may allow mathematicians to isolate the critical ideas that give rise to uncertainty in traditional, binary proofs. This process may even lead to a more plausible classical proof. An illustration of the probabilist approach is Michael Rabin’s algorithm for testing probable primality (Rabin, 1976). For very large integers N, all of the classical techniques for determining whether N is composite become unworkable. Using even the most clever programming, the calculations required to determine whether numbers larger than $10^{10^4}$ are prime require staggering amounts of computing time. Rabin’s insight was that if you are willing to settle for a very good probability that N is prime (or not prime), then you can get it within a reasonable amount of time—and with vanishingly small probability of error.
考虑到什么构成可接受的证明的这些不确定性,这毕竟是数学过程中相当基本的要素,那么数学是如何幸存下来并如此成功的呢?如果证明与正式的演绎推理几乎没有什么相似之处,如果它们可以经受几代人的考验然后失败,如果它们可以包含无法检测的缺陷,如果它们只能表达一定误差范围内的真理概率——如果它们实际上是,无法在保证定理超越概率的意义上证明它们,如果必要的话,超越洞察力,那么,数学是如何运作的呢?它如何成功地提出重要且令人信服的定理?
In view of these uncertainties over what constitutes an acceptable proof, which is after all a fairly basic element of the mathematical process, how is it that mathematics has survived and been so successful? If proofs bear little resemblance to formal deductive reasoning, if they can stand for generations and then fall, if they can contain flaws that defy detection, if they can express only the probability of truth within certain error bounds–if they are, in fact, not able to prove theorems in the sense of guaranteeing them beyond probability and, if necessary, beyond insight, well, then, how does mathematics work? How does it succeed in developing theorems that are significant and that compel belief?
首先,定理的证明是一条消息。证明并不是一个独立存在的美丽的抽象对象。没有哪个数学家掌握了一个证明,坐下来,并因知道他现在可以确定他的定理的正确性而高兴地叹息。他跑进大厅,想找人听听。他冲进同事的办公室,霸占了黑板。他抛开了预定的话题,用他的新想法来举办研讨会。他把他的研究生从论文中拉开来听。他拿起电话告诉了德克萨斯州和多伦多的同事。在其第一个形式中,证明是口头消息,或者至多是黑板或餐巾纸上的草图。
First of all, the proof of a theorem is a message. A proof is not a beautiful abstract object with an independent existence. No mathematician grasps a proof, sits back, and sighs happily at the knowledge that he can now be certain of the truth of his theorem. He runs out into the hall and looks for someone to listen to it. He bursts into a colleague’s office and commandeers the blackboard. He throws aside his scheduled topic and regales a seminar with his new idea. He drags his graduate students away from their dissertations to listen. He gets onto the phone and tells his colleagues in Texas and Toronto. In its first incarnation, a proof is a spoken message, or at most a sketch on a chalkboard or a paper napkin.
口头阶段是证明的第一个过滤器。如果它没有在他的朋友中引起兴奋或信任,明智的数学家就会重新考虑它。但如果他们发现它相当有趣且可信,他就会把它写下来。在草稿流传一段时间后,如果看起来仍然合理,他会制作一个完善的版本并提交出版。如果审稿人也认为它有吸引力且令人信服,它就会被出版,以便更广泛的读者可以阅读。如果广大读者中有足够多的人相信它并喜欢它,那么在一段适当的冷静期之后,评论出版物会采取更悠闲的态度,看看这个证据是否真的像它最初出现的那样令人愉悦,以及在冷静考虑后,他们真的相信这一点。
That spoken stage is the first filter for a proof. If it generates no excitement or belief among his friends, the wise mathematician reconsiders it. But if they find it tolerably interesting and believable, he writes it up. After it has circulated in draft for a while, if it still seems plausible, he does a polished version and submits it for publication. If the referees also find it attractive and convincing, it gets published so that it can be read by a wider audience. If enough members of that larger audience believe it and like it, then after a suitable cooling-off period the reviewing publications take a more leisurely look, to see whether the proof is really as pleasing as it first appeared and whether, on calm consideration, they really believe it.
当一个证据被相信时会发生什么?最直接的过程可能是结果的内化。也就是说,阅读并相信证明的数学家会尝试解释它,用自己的术语表达它,使其符合他自己对数学知识的个人看法。没有两个数学家可能以完全相同的方式内化一个数学概念,因此这个过程通常会导致同一定理的多个版本,每个版本都强化了信念,每个版本都增加了数学界的感觉:声明很可能是真的。例如,高斯获得了至少六个关于他的“二次互反定律”的独立证明。迄今为止,已知有五十多个该定律的证据。伊姆雷·拉卡托斯(Imre Lakatos)在他的《证明与反驳》(Lakatos,1976)中对几个著名定理从最初的概念到普遍接受所经历的转变进行了历史性的准确讨论。拉卡托斯证明,欧拉公式V − E + F = 2 在首次陈述后的近 200 年间被一次又一次地重新表述,直到最终达到目前的稳定形式。可能发生的最引人注目的转变是泛化。如果通过与原始定理相同的社会过程,广义定理被相信,那么原始陈述的可信度就会大大提高。
And what happens to a proof when it is believed? The most immediate process is probably an internalization of the result. That is, the mathematician who reads and believes a proof will attempt to paraphrase it, to put it in his own terms, to fit it into his own personal view of mathematical knowledge. No two mathematicians are likely to internalize a mathematical concept in exactly the same way, so this process leads usually to multiple versions of the same theorem, each reinforcing belief, each adding to the feeling of the mathematical community that the original statement is likely to be true. Gauss, for example, obtained at least half a dozen independent proofs of his “law of quadratic reciprocity”; to date over fifty proofs of this law are known. Imre Lakatos gives, in his Proofs and Refutations (Lakatos, 1976), historically accurate discussions of the transformations that several famous theorems underwent from initial conception to general acceptance. Lakatos demonstrates that Euler’s formula V − E + F = 2 was reformulated again and again for almost two hundred years after its first statement, until it finally reached its current stable form. The most compelling transformation that can take place is generalization. If, by the same social process that works on the original theorem, the generalized theorem comes to be believed, then the original statement gains greatly in plausibility.
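As a small illustration of why a formula such as V − E + F = 2 needed its hypotheses refined, the sketch below evaluates it on the five convex regular solids and on one polyhedron with a hole through it. The vertex, edge, and face counts are standard; the names `euler_characteristic`, `CONVEX_SOLIDS`, and `PICTURE_FRAME` are illustrative assumptions, and the example is not taken from the text.

```python
def euler_characteristic(vertices: int, edges: int, faces: int) -> int:
    """Return V - E + F for a polyhedron given its counts."""
    return vertices - edges + faces


# (V, E, F) for the five convex regular solids.
CONVEX_SOLIDS = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
}

# A square "picture frame": a block with a square hole through it, with the
# top and bottom rings each split into four trapezoidal faces.
PICTURE_FRAME = (16, 32, 16)

if __name__ == "__main__":
    for name, counts in CONVEX_SOLIDS.items():
        print(name, euler_characteristic(*counts))
    print("picture frame", euler_characteristic(*PICTURE_FRAME))
```

Every convex solid gives 2, while the frame gives 0, so the bare statement "for every polyhedron" cannot stand; a stable form of the theorem has to say which polyhedra it covers.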
一个可信的定理被使用。在较大的证明中,它可能会显示为引理;如果它不导致矛盾,那么我们就更愿意相信它。或者工程师可以通过将物理值插入其中来使用它。我们对经典应力方程有相当高的信心,因为我们看到了矗立的桥梁;我们对流体力学的基本定理有一定的信心,因为我们看到飞机会飞。
A believable theorem gets used. It may appear as a lemma in larger proofs; if it does not lead to contradictions, then we are all the more inclined to believe it. Or engineers may use it by plugging physical values into it. We have fairly high confidence in classical stress equations because we see bridges that stand; we have some confidence in the basic theorems of fluid mechanics because we see airplanes that fly.
可信的结果有时会与数学的其他领域产生联系——重要的领域总是如此。一个定理或证明技术从一个数学分支成功转移到另一个数学分支会增加我们对其的信心。例如,1964 年,Paul Cohen 使用一种称为“强迫”的技术来证明集合论中的定理(Cohen,1963);当时,他的想法非常激进,以至于他的证明很难被理解。但随后其他研究人员在代数背景下解释了强迫的概念,将其与更熟悉的逻辑思想联系起来,概括了这些概念,并发现这些概括是有用的。所有这些联系(以及导致接受的其他正常社会过程)使得强制交易的想法变得更加引人注目,而今天强制是集合论中研究生的常规研究。
Believable results sometimes make contact with other areas of mathematics—important ones invariably do. The successful transfer of a theorem or a proof technique from one branch of mathematics to another increases our feeling of confidence in it. In 1964, for example, Paul Cohen used a technique called forcing to prove a theorem in set theory (Cohen, 1963); at that time, his notions were so radical that the proof was hardly understood. But subsequently other investigators interpreted the notion of forcing in an algebraic context, connected it with more familiar ideas in logic, generalized the concepts, and found the generalizations useful. All of these connections (along with the other normal social processes that lead to acceptance) made the idea of forcing a good deal more compelling, and today forcing is routinely studied by graduate students in set theory.
经过足够的内化、足够的转化、足够的泛化、足够的使用和足够的联系,数学界最终决定,原来的定理中的中心概念,现在可能已经发生了很大的变化,具有最终的稳定性。如果各种证明感觉正确并且从足够的角度检验结果,那么该定理的真实性最终被认为是成立的。该定理被认为在经典意义上是正确的,也就是说,它可以通过形式的演绎逻辑来证明,尽管对于几乎所有定理来说,这样的演绎从未发生过或永远不会发生。
After enough internalization, enough transformation, enough generalization, enough use, and enough connection, the mathematical community eventually decides that the central concepts in the original theorem, now perhaps greatly changed, have an ultimate stability. If the various proofs feel right and the results are examined from enough angles, then the truth of the theorem is eventually considered to be established. The theorem is thought to be true in the classical sense—that is, in the sense that it could be demonstrated by formal, deductive logic, although for almost all theorems no such deduction ever took place or ever will.
因为清晰易懂的事物具有吸引力;复杂的令人排斥。——大卫·希尔伯特
For what is clear and easily comprehended attracts; the complicated repels. —David Hilbert
有时一个人必须说一些困难的事情,但一个人应该尽可能简单地说出来。——GH·哈迪
Sometimes one has to say difficult things, but one ought to say them as simply as one knows how. —G. H. Hardy
一般来说,最重要的数学问题都是清晰且易于表述的。一个重要的定理更有可能采用A形式而不是B形式。
As a rule, the most important mathematical problems are clean and easy to state. An important theorem is much more likely to take form A than form B.
几个世纪以来最让数学家着迷、最痛苦、最高兴的问题都是最容易表述的问题。爱因斯坦认为,一个科学理论的成熟程度可以通过它向街上的人解释得如何来判断。四色定理建立在如此薄弱的基础之上,以至于它可以完全精确地向孩子陈述。如果孩子学会了乘法表,他就能理解素数的位置和分布问题。对定义“数”概念的问题的深深着迷可能会让他成为一名数学家。
The problems that have most fascinated and tormented and delighted mathematicians over the centuries have been the simplest ones to state. Einstein held that the maturity of a scientific theory could be judged by how well it could be explained to the man on the street. The four-color theorem rests on such slender foundations that it can be stated with complete precision to a child. If the child has learned his multiplication tables, he can understand the problem of the location and distribution of the prime numbers. And the deep fascination of the problem of defining the concept of “number” might turn him into a mathematician.
重要性和简单性之间的相关性并非偶然。简单、有吸引力的定理是最有可能被听到、读到、内化和使用的定理。数学家使用简单性作为证明的第一个测试。只有乍一看很有趣,他们才会仔细考虑。数学家不是利他主义的受虐狂。相反,数学史是一部对轻松、愉悦和优雅的长期探索——当然是在符号领域。
The correlation between importance and simplicity is no accident. Simple, attractive theorems are the ones most likely to be heard, read, internalized, and used. Mathematicians use simplicity as the first test for a proof. Only if it looks interesting at first glance will they consider it in detail. Mathematicians are not altruistic masochists. On the contrary, the history of mathematics is one long search for ease and pleasure and elegance—in the realm of symbols, of course.
即使数学家不愿意,他们也必须使用简单性标准;从心理上来说,除了200,000个候选者中最简单和最有吸引力的一个之外,不可能选择任何一个来引起人们的注意。如果数学中存在不简单的重要基本概念,数学家可能永远不会发现它们。凌乱、丑陋的数学命题只适用于微不足道的结构类,特殊的命题,依赖于极其昂贵的数学机器的命题,需要五块黑板或一卷纸巾才能绘制的命题——这些不太可能被吸收到身体中数学。然而,只有通过这种同化,证据才具有可信度。证据本身并不算什么;只有当它经历了数学界的社会过程时,它才变得可信。
Even if they didn’t want to, mathematicians would have to use the criterion of simplicity; it is a psychological impossibility to choose any but the simplest and most attractive of 200,000 candidates for one’s attention. If there are important, fundamental concepts in mathematics that are not simple, mathematicians will probably never discover them. Messy, ugly mathematical propositions that apply only to paltry classes of structures, idiosyncratic propositions, propositions that rely on inordinately expensive mathematical machinery, propositions that require five blackboards or a roll of paper towels to sketch—these are unlikely ever to be assimilated into the body of mathematics. And yet it is only by such assimilation that proofs gain believability. The proof by itself is nothing; only when it has been subjected to the social processes of the mathematical community does it become believable.
在本文中,我们倾向于强调简单性高于一切,因为这是任何证明的第一个过滤器。但我们不想把自己和我们的数学家同行描绘成市侩或畜生。一旦一个想法满足了简单性的标准,其他标准就会帮助确定它在让数学家抽象地凝视远方的想法中的位置。尤里·马宁(Yuri Manin)说得最好:一个好的证据可以让我们变得更明智。
In this paper, we have tended to stress simplicity above all else because that is the first filter for any proof. But we do not wish to paint ourselves and our fellow mathematicians as philistines or brutes. Once an idea has met the criterion of simplicity, other standards help determine its place among the ideas that make mathematicians gaze off abstractedly into the distance. Yuri Manin has put it best: A good proof is one that makes us wiser.
相反,我发现逻辑斯蒂(logistic)对发现者而言除了枷锁之外什么也没有。它在简洁性方面对我们毫无帮助,远非如此;如果需要 27 个方程才能证明 1 是一个数,那么要证明一个真正的定理又需要多少个方程?——亨利·庞加莱
On the contrary, I find nothing in logistic for the discoverer but shackles. It does not help us at all in the direction of conciseness, far from it; and if it requires twenty-seven equations to establish that 1 is a number, how many will it require to demonstrate a real theorem? —Henri Poincaré
数学家作为科学家顾问的主要职责之一……就是阻止他们对数学抱有过高的期望。——诺伯特·维纳
One of the chief duties of the mathematician in acting as an advisor to scientists … is to discourage them from expecting too much from mathematics. —Norbert Wiener
只有当数学证明经受了数学界社会机制的检验之后,它才能增加我们对数学陈述真实性的信心。这些相同的机制注定了所谓软件证明的失败,即那些冗长的形式验证;它们对应的不是实际工作中的数学证明,而是数学家为描述自己的信念感而想象出来的逻辑结构。验证不是消息;一个跑到走廊里宣讲自己最新验证的人很快就会发现自己成了社交上的贱民。验证无法真正被阅读;读者可以凭借英勇的努力硬着头皮读完一篇较短的验证,但那不是阅读。验证不可读,而且名副其实地不可言说,因此无法被内化、转化、推广、使用、与其他学科联系起来,并最终融入群体的意识。它们无法像数学定理那样逐渐获得可信度;人们要么把它当作纯粹的信仰行为而盲目相信,要么根本不信。
Mathematical proofs increase our confidence in the truth of mathematical statements only after they have been subjected to the social mechanisms of the mathematical community. These same mechanisms doom the so-called proofs of software, the long formal verifications that correspond, not to the working mathematical proof, but to the imaginary logical structure that the mathematician conjures up to describe his feeling of belief. Verifications are not messages; a person who ran out into the hall to communicate his latest verification would rapidly find himself a social pariah. Verifications cannot really be read; a reader can flay himself through one of the shorter ones by dint of heroic effort, but that’s not reading. Being unreadable and—literally—unspeakable, verifications cannot be internalized, transformed, generalized, used, connected to other disciplines, and eventually incorporated into a community consciousness. They cannot acquire credibility gradually, as a mathematical theorem does; one either believes them blindly, as a pure act of faith, or not at all.
在这一点上,一些验证的拥护者承认与数学的类比是失败的。他们认为A(编程)类似于B(数学),并且随后了解到B与他们想象的完全不同,因此他们希望论证A类似于B ',即他们神话版本的B。然后我们发现自己处于一个特殊的位置,提出了最初属于他们的论点,断言是的,确实,A确实类似于B;然而,我们的论点与他们的不同。(参见图 44.1和44.2。)
At this point, some adherents of verification admit that the analogy to mathematics fails. Having argued that A, programming, resembles B, mathematics, and having subsequently learned that B is nothing like what they imagined, they wish to argue instead that A is like B′, their mythical version of B. We then find ourselves in the peculiar position of putting across the argument that was originally theirs, asserting that yes, indeed, A does resemble B; our argument, however, matches the terms up differently from theirs. (See Figures 44.1 and 44.2.)
图44.1: 验证者的原始类比
Figure 44.1: The verifiers’ original analogy
图 44.2: 我们的类比
Figure 44.2: Our analogy
希望放弃这一明喻并代之以B′ 的验证者,为了便于理解,也应该放弃B的语言;特别是,如果他们不把自己的验证称为“证明”,将会有所帮助。至于我们自己,我们将继续认为编程就像数学,并且在数学证明中起作用的同样的社会过程注定了验证的失败。
Verifiers who wish to abandon the simile and substitute B′ should as an aid to understanding abandon the language of B as well—in particular, it would help if they did not call their verifications “proofs.” As for ourselves, we will continue to argue that programming is like mathematics, and that the same social processes that work in mathematical proofs doom verifications.
对核查有一个基本的逻辑反对意见,这种反对意见本身具有形式主义的严格性。由于对计划的要求是非正式的,而计划是正式的,因此必须有一个过渡,而过渡本身也必然是非正式的。我们苦恼地发现,这个对我们来说似乎不言而喻的命题却存在争议。因此,我们应该强调,作为反形式主义者,我们不会反对基于这些理由的验证;我们只是想知道这个本质上非正式的步骤如何符合形式主义的观点。核查的拥护者是否忽视了他们所处理的正式对象的非正式起源?他们断言他们的形式化在某种程度上是无可争议的吗?我们必须承认我们的困惑和沮丧。
There is a fundamental logical objection to verification, an objection on its own ground of formalistic rigor. Since the requirement for a program is informal and the program is formal, there must be a transition, and the transition itself must necessarily be informal. We have been distressed to learn that this proposition, which seems self-evident to us, is controversial. So we should emphasize that as antiformalists, we would not object to verification on these grounds; we only wonder how this inherently informal step fits into the formalist view. Have the adherents of verification lost sight of the informal origins of the formal objects they deal with? Is it their assertion that their formalizations are somehow incontrovertible? We must confess our confusion and dismay.
然后还有另一个逻辑困难,几乎和上面的困难一样基本,但绝不像上面的困难那么令人吹毛求疵:只有当规范和程序是独立导出时,程序与其规范一致的正式证明才有价值。在实验验证的玩具程序气氛中,这个标准很容易满足。但在现实生活中,如果一个程序在设计过程中失败了,它就会被改变,而改变是基于对其规范的了解;或者规格被更改,并且这些更改基于通过失败获得的程序知识。无论哪种情况,都不再满足具有独立标准来相互检查的要求。再次,我们希望没有人建议在设计过程中不要反复修改程序和规范。那将是一种令人难以置信的贫困——我们担心,这种贫困确实是由于对形式逻辑的迷恋造成的。
Then there is another logical difficulty, nearly as basic, and by no means so hair-splitting as the one above: The formal demonstration that a program is consistent with its specifications has value only if the specifications and the program are independently derived. In the toy-program atmosphere of experimental verification, this criterion is easily met. But in real life, if during the design process a program fails, it is changed, and the changes are based on knowledge of its specifications; or the specifications are changed, and those changes are based on knowledge of the program gained through the failure. In either case, the requirement of having independent criteria to check against each other is no longer met. Again, we hope that no one would suggest that programs and specifications should not be repeatedly modified during the design process. That would be a position of incredible poverty—the sort of poverty that does, we fear, result from infatuation with formal logic.
回到现实世界,生产软件附带的输入/输出规范很少是简单的。它们往往又长又复杂又奇特。举一个极端的例子,计算法国国家铁路的工资单需要超过 3,000 个工资率(一个上坡,一个下坡,等等)。任何合理的编译器或操作系统的规范都会充满大量内容,但没有人相信它们是完整的。甚至在某些情况下,黑盒代码、数值算法可以被证明是有效的,就像它们被用来建造真正的飞机或钻探真正的油井一样,但其工作原理却无人知晓;这些算法的输入断言甚至无法公式化,更不用说形式化了。仅举一个例子,一种重要的算法,其名字相当活泼,即“反向 Cuthill-McKee”,多年来,在实验室测试、现场试验和生产中,人们都知道它比凭经验所知的普通 Cuthill-McKee 好得多。然而,直到最近,它的优越性才在理论上得到证明(George,1971),而且即使如此,也只能通过通常的非正式数学证明,而不是通过正式的演绎。在反向 Cuthill-McKee 未经证实的那些年里,尽管它会自动生成任何看起来无法验证的程序,但程序员却顽固地继续使用它。
Back in the real world, the kinds of input/output specifications that accompany production software are seldom simple. They tend to be long and complex and peculiar. To cite an extreme case, computing the payroll for the French National Railroad requires more than 3,000 pay rates (one uphill, one downhill, and so on). The specifications for any reasonable compiler or operating system fill volumes—and no one believes that they are complete. There are even some cases of black-box code, numerical algorithms that can be shown to work in the sense that they are used to build real airplanes or drill real oil wells, but work for no reason that anyone knows; the input assertions for these algorithms are not even formulable, let alone formalizable. To take just one example, an important algorithm with the rather jaunty name of Reverse Cuthill-McKee was known for years to be far better than plain Cuthill-McKee, known empirically, in laboratory tests and field trials and in production. Only recently, however, has its superiority been theoretically demonstrable (George, 1971), and even then only with the usual informal mathematical proof, not with a formal deduction. During all of the years when Reverse Cuthill-McKee was unproved, even though it automatically made any program in which it appeared unverifiable, programmers perversely went on using it.
也许有人会反驳说,虽然现实生活中的规范冗长而复杂,但并不深入。事实上,它们的验证只不过是借助简单代数恒等式来检查的极长的替换链。
It might be countered that while real-life specifications are lengthy and complicated, they are not deep. Their verifications are, in fact, nothing more than extremely long chains of substitutions to be checked with the aid of simple algebraic identities.
对此,我们只能说:正是如此。验证冗长、复杂但肤浅;这就是它们的问题所在。即使是一个微不足道的程序的验证也可能长达几十页,而且这些页面中没有任何轻松的时刻或智慧的火花。没有人会带着程序验证跑进朋友的办公室。没有人会在餐巾纸上画出验证草图。没有人会拉住同事听他讲验证。没有人会去读它。一想到这里,人们就会感到目光呆滞。有人建议,可以直接处理广泛数学对象的超高级语言,或者据说可以被简洁地公理化的函数式语言,或许能用来确保验证是有趣的,从而对类似数学社会过程的社会过程做出响应。从理论上讲,这个想法听起来很有希望;但在实践中,它行不通。……
All we can say in response to this is: Precisely. Verifications are long and involved but shallow; that’s what’s wrong with them. The verification of even a puny program can run into dozens of pages, and there’s not a light moment or a spark of wit on any of those pages. Nobody is going to run into a friend’s office with a program verification. Nobody is going to sketch a verification out on a paper napkin. Nobody is going to buttonhole a colleague into listening to a verification. Nobody is ever going to read it. One can feel one’s eyes glaze over at the very thought. It has been suggested that very high level languages, which can deal directly with a broad range of mathematical objects, or functional languages, which it is said can be concisely axiomatized, might be used to insure that a verification would be interesting and therefore responsive to a social process like the social process of mathematics. In theory this idea sounds hopeful; in practice, it doesn’t work out. …
一些验证者会承认验证对于绝大多数程序来说根本行不通,但认为对于一些关键应用程序来说,这种痛苦是值得的。他们指出,空中交通管制、导弹系统和太空探索等领域的风险非常高,因此花费任何时间和精力都是合理的。
Some verifiers will concede that verification is simply unworkable for the vast majority of programs but argue that for a few crucial applications the agony is worthwhile. They point to air-traffic control, missile systems, and the exploration of space as areas in which the risks are so high that any expenditure of time and effort can be justified.
即使是这样,我们仍然坚持验证放弃其对所有其他编程领域的要求;例如,在入门编程课程中教学生如何进行验证,应该像在生物学入门课程中教学生如何进行心脏直视手术一样牵强。但这些风险并不会影响我们的信念,即验证任何足够大且足够灵活的系统来完成任何现实世界的任务是基本不可能的。无论回报有多高,没有人能够强迫自己阅读现实系统中极其冗长、乏味的验证,除非它们可以被阅读、理解和提炼,否则验证是毫无价值的。
Even if this were so, we would still insist that verification renounce its claim on all other areas of programming; to teach students in introductory programming courses how to do verification, for instance, ought to be as farfetched as teaching students in introductory biology how to do open-heart surgery. But the stakes do not affect our belief in the basic impossibility of verifying any system large enough and flexible enough to do any real-world task. No matter how high the payoff, no one will ever be able to force himself to read the incredibly long, tedious verifications of real-life systems, and unless they can be read, understood, and refined, the verifications are worthless.
现在,可能有人会说,所有这些对可读性和内化的引用都是无关紧要的,验证的目的最终是构建一个自动验证系统。
Now, it might be argued that all these references to readability and internalization are irrelevant, that the aim of verification is eventually to construct an automatic verifying system.
不幸的是,有大量证据表明全自动验证系统是不可能的。数学定理的正式证明的长度下限是巨大的(Stockmeyer,1974),并且没有理由相信这种程序的证明会更短或更清晰——恰恰相反。事实上,即使是程序验证的坚定拥护者也没有认真对待完全自动化验证器的可能性。验证的支持者拉尔夫·伦敦 (Ralph London) 谈到了一种午餐外系统,该系统可以在无人监督的情况下进行验证;但他怀疑这样的系统是否能够以合理的可靠性运行。一个团体对可预见的未来自动化感到绝望,提出验证应该由“咕噜数学家”团队来执行,这些低水平的数学团队将检查验证条件。提出这样的建议的人的情感似乎很奇怪,但它们确实表明自动验证的可能性有多遥远。
Unfortunately, there is a wealth of evidence that fully automated verifying systems are out of the question. The lower bounds on the length of formal demonstrations for mathematical theorems are immense (Stockmeyer, 1974), and there is no reason to believe that such demonstrations for programs would be any shorter or cleaner—quite the contrary. In fact, even the strong adherents of program verification do not take seriously the possibility of totally automated verifiers. Ralph London, a proponent of verification, speaks of an out-to-lunch system, one that could be left unsupervised to grind out verifications; but he doubts that such a system can be built to work with reasonable reliability. One group, despairing of automation in the foreseeable future, has proposed that verifications should be performed by teams of “grunt mathematicians,” low level mathematical teams who will check verification conditions. The sensibilities of people who could make such a proposal seem odd, but they do serve to indicate how remote the possibility of automated verification must be.
然而,假设可以以某种方式构建一个自动验证器。进一步假设程序员确实以某种方式对其验证产生了信心。如果这种信念没有任何现实基础,它就只能是盲目信仰,但没关系。假设点金石已经被发现,铅可以变成金,并且程序员相信将他们的程序输入验证者的张口中的优点。在我们看来,验证的支持者设想的场景是这样的:程序员将他的 300 行输入/输出包插入到验证器中。几个小时后,他回来了。有他的 20,000 行验证和“已验证”消息。
Suppose, however, that an automatic verifier could somehow be built. Suppose further that programmers did somehow come to have faith in its verifications. In the absence of any real-world basis for such belief, it would have to be blind faith, but no matter. Suppose that the philosopher’s stone had been found, that lead could be changed to gold, and that programmers were convinced of the merits of feeding their programs into the gaping jaws of a verifier. It seems to us that the scenario envisioned by the proponents of verification goes something like this: The programmer inserts his 300-line input/output package into the verifier. Several hours later, he returns. There is his 20,000-line verification and the message “VERIFIED.”
当我们开始感觉到一个结构在逻辑上是正确的、可证明是正确的时,就会有一种倾向,即从其中删除我们最初由于缺乏理解而建立的任何冗余。极端地讲,这种趋势会带来所谓的泰坦尼克效应。当故障确实发生时,其规模是巨大且无法控制的。换句话说,系统故障的严重程度与设计者认为系统不会故障的信念强度成正比。仅仅为了可以验证而设计的干净整洁的程序将特别容易受到泰坦尼克号效应的影响。我们已经看到了这种现象的迹象。在他们关于 Euclid(Popek 等人,1977)(一种为程序验证而设计的语言)的注释中,几位最重要的验证拥护者说,“因为我们期望所有 Euclid 程序都得到验证,所以我们没有为异常处理做出特殊规定......。经过验证的程序中不应出现运行时软件错误。” 错误不应该发生吗?不应该沉没的船的阴影。
There is a tendency, as we begin to feel that a structure is logically, provably right, to remove from it whatever redundancies we originally built in because of lack of understanding. Taken to its extreme, this tendency brings on the so-called Titanic effect; when failure does occur, it is massive and uncontrolled. To put it another way, the severity with which a system fails is directly proportional to the intensity of the designer’s belief that it cannot fail. Programs designed to be clean and tidy merely so that they can be verified will be particularly susceptible to the Titanic effect. Already we see signs of this phenomenon. In their notes on Euclid (Popek et al., 1977), a language designed for program verification, several of the foremost verification adherents say, “Because we expect all Euclid programs to be verified, we have not made special provisions for exception handling …. Runtime software errors should not occur in verified programs.” Errors should not occur? Shades of the ship that shouldn’t be sunk.
因此,暂时搁置所有理性的怀疑,让我们假设程序员收到“已验证”消息。让我们进一步假设该消息不是由验证系统方面的故障引起的。程序员懂什么?他知道他的计划在形式上、逻辑上、可证明、可证明是正确的。然而,他不知道它在多大程度上是可靠的、可靠的、值得信赖的、安全的;他不知道它会在什么限度内发挥作用;他不知道当超过这些限制时会发生什么。然而他却拥有神秘的认可印记:“已验证”。我们几乎可以看到冰山在不沉船的背景中若隐若现。幸运的是,没有理由担心这样的未来。想象一下同一个程序员返回时发现相同的 20,000 行。假设真的可以建立一个自动验证器,他到底会发现什么信息?当然,该消息将是“未验证”。程序员将进行更改,再次输入程序,然后再次返回。“未经审核的。” 他会再次进行更改,再次将程序提供给验证者,再次“未验证”。程序是人类的产物;现实生活中的程序是复杂的人工制品;任何足够大和复杂的人造制品都是不完美的。该消息永远不会显示“已验证”。
So, having for the moment suspended all rational disbelief, let us suppose that the programmer gets the message “VERIFIED.” And let us suppose further that the message does not result from a failure on the part of the verifying system. What does the programmer know? He knows that his program is formally, logically, provably, certifiably correct. He does not know, however, to what extent it is reliable, dependable, trustworthy, safe; he does not know within what limits it will work; he does not know what happens when it exceeds those limits. And yet he has that mystical stamp of approval: “VERIFIED.” We can almost see the iceberg looming in the background over the unsinkable ship. Luckily, there is little reason to fear such a future. Picture the same programmer returning to find the same 20,000 lines. What message would he really find, supposing that an automatic verifier could really be built? Of course, the message would be “NOT VERIFIED.” The programmer would make a change, feed the program in again, return again. “NOT VERIFIED.” Again he would make a change, again he would feed the program to the verifier, again “NOT VERIFIED.” A program is a human artifact; a real-life program is a complex human artifact; and any human artifact of sufficient size and complexity is imperfect. The message will never read “VERIFIED.”
我们可以粗略地说,如果一个数学思想能够以自然且富有启发性的方式与大量其他数学思想相联系,那么它就是“有意义的”。——GH·哈迪
We may say, roughly, that a mathematical idea is “significant” if it can be connected, in a natural and illuminating way, with a large complex of other mathematical ideas. —G. H. Hardy
唯一真正值得验证的辩护是扩大规模的论点。我们可以尽可能地重现它,它是这样的:
The only really fetching defense ever offered for verification is the scaling-up argument. As best we can reproduce it, here is how it goes:
1. 验证目前还处于起步阶段。目前,它能处理的最大任务是像FIND这样的算法和像GCD这样的模型程序的验证。随着时间的推移,它将能够处理越来越复杂的算法和越来越棘手的模型程序。这些验证可与数学证明相媲美。他们被阅读了。它们与定理一样引起人们的兴趣和兴奋。它们受制于数学推理或任何其他学科的推理的普通社会过程。
1. Verification is now in its infancy. At the moment, the largest tasks it can handle are verifications of algorithms like FIND and model programs like GCD. It will in time be able to tackle more and more complicated algorithms and trickier and trickier model programs. These verifications are comparable to mathematical proofs. They are read. They generate the same kinds of interest and excitement that theorems do. They are subject to the ordinary social processes that work on mathematical reasoning, or on reasoning in any other discipline, for that matter.
2. 大生产系统无非是由算法和模型程序组成。一旦经过验证,算法和模型程序就可以组成大型的日常生产系统,而一个大系统的(不可否认的不可读的)验证将是其组件的许多小型的、有吸引力的、有趣的验证的总和。
2. Big production systems are made up of nothing more than algorithms and model programs. Once verified, algorithms and model programs can make up large, workaday production systems, and the (admittedly unreadable) verification of a big system will be the sum of the many small, attractive, interesting verifications of its components.
对于(1)我们没有争论。事实上,早在计算机发明之前,算法就已经被证明,证明也被阅读、讨论和吸收——而且明显缺乏形式机制。我们的猜测是,算法和模型程序的研究将像任何其他数学活动一样发展,主要通过非正式的社会机制,很少通过正式机制。
With (1) we have no quarrel. Actually, algorithms were proved and the proofs read and discussed and assimilated long before the invention of computers—and with a striking lack of formal machinery. Our guess is that the study of algorithms and model programs will develop like any other mathematical activity, chiefly by informal, social mechanisms, very little if at all by formal mechanisms.
我们对(2)有根本的分歧。我们认为,FIND 或 GCD 的世界与生产软件、编写真实账单的计费系统、安排真实事件的调度系统、发行真实门票的票务系统之间不存在连续性。我们认为生产软件的世界本身就是不连续的。
It is with (2) that we have our fundamental disagreement. We argue that there is no continuity between the world of FIND or GCD and the world of production software, billing systems that write real bills, scheduling systems that schedule real events, ticketing systems that issue real tickets. And we argue that the world of production software is itself discontinuous.
没有程序员会同意大型生产系统只不过是由算法和小程序组成的。补丁、临时结构、创可贴和止血带、花哨的东西、胶水、吐痰和抛光、签名代码、血汗和泪水,当然还有厨房水槽——实践程序员的彩色行话似乎是谈论他所使用的结构的性质;也许理论家应该听听他的意见。据估计,任何实际生产系统中超过一半的代码都由用户界面和错误消息组成,即临时的、非正式的结构,根据定义,这些结构是无法验证的。甚至验证者本身有时似乎也意识到大多数真实软件的不可验证性。CAR Hoare 曾说过:“在许多应用中,算法几乎没有发挥任何作用,当然也几乎不会出现任何问题。” (我们希望我们能够报告说他随即举手并放弃了验证,但没有这样的运气。)
No programmer would agree that large production systems are composed of nothing more than algorithms and small programs. Patches, ad hoc constructions, bandaids and tourniquets, bells and whistles, glue, spit and polish, signature code, blood-sweat-and-tears, and, of course, the kitchen sink—the colorful jargon of the practicing programmer seems to be saying something about the nature of the structures he works with; maybe theoreticians ought to be listening to him. It has been estimated that more than half the code in any real production system consists of user interfaces and error messages—ad hoc, informal structures that are by definition unverifiable. Even the verifiers themselves sometimes seem to realize the unverifiable nature of most real software. C. A. R. Hoare has been quoted as saying, “In many applications, algorithm plays almost no role, and certainly presents almost no problem.” (We wish we could report that he thereupon threw up his hands and abandoned verification, but no such luck.)
或者换个角度来看看 GCD 的世界和生产软件的世界之间的区别:算法的规范简洁明了,而现实系统的规范却是巨大的,通常与系统处于同一数量级他们自己。算法规范高度稳定,稳定数十年甚至数百年;真实系统的规格每天或每小时都在变化(任何程序员都可以证明)。算法规范是可导出的、通用的;真实系统的规范是特殊的和临时的。这些并不是程度的差异。它们是种类上的差异。照顾一个熟睡的孩子一小时并不等同于抚养一个十口之家——问题本质上是根本不同的。
Or look at the difference between the world of GCD and the world of production software in another way: The specifications for algorithms are concise and tidy, while the specifications for real-world systems are immense, frequently of the same order of magnitude as the systems themselves. The specifications for algorithms are highly stable, stable over decades or even centuries; the specifications for real systems vary daily or hourly (as any programmer can testify). The specifications for algorithms are exportable, general; the specifications for real systems are idiosyncratic and ad hoc. These are not differences in degree. They are differences in kind. Babysitting for a sleeping child for one hour does not scale up to raising a family of ten—the problems are essentially, fundamentally different.
在实际生产软件的世界内部也不存在连续性。扩大规模的论点似乎基于这样一个模糊概念:编程的世界就像牛顿物理学的世界一样,由平滑、连续的函数组成。但事实上,程序是参差不齐的,充满了孔洞和洞穴。每个程序员都知道,改变一行,有时甚至只改一个比特,都可能彻底毁掉一个程序,或者以我们不理解和无法预测的方式损坏它。然而在其他时候,相当大的改动似乎什么也没有改变;坊间流传着许多恶作剧和蓄意破坏的故事,这些行为始终未被察觉,让肇事者大失所望。
And within the world of real production software there is no continuity either. The scaling-up argument seems to be based on the fuzzy notion that the world of programming is like the world of Newtonian physics—made up of smooth, continuous functions. But, in fact, programs are jagged and full of holes and caverns. Every programmer knows that altering a line or sometimes even a bit can utterly destroy a program or mutilate it in ways that we do not understand and cannot predict. And yet at other times fairly substantial changes seem to alter nothing; the folklore is filled with stories of pranks and acts of vandalism that frustrated the perpetrators by remaining forever undetected.
有一个经典的科幻故事,讲述的是一位时间旅行者回到原始丛林观看恐龙,然后回来发现自己的时间几乎被改得面目全非。政治、建筑、语言——甚至植物和动物似乎都是错误的、扭曲的。只有当他脱下穿越服时,他才明白发生了什么。在他的靴子后跟上,被压碎了一只蝴蝶的翅膀,它脱离了过去,因此无法在世界的演变中发挥其作用。每个程序员都知道这种感觉:一个微不足道的微小变化会对一个庞大的系统造成严重破坏。在我们更多地了解编程之前,出于所有实际目的,我们最好将系统视为由蝴蝶的翅膀组成,而不是由算法和较小的程序等坚固的结构组成。
There is a classic science-fiction story about a time traveler who goes back to the primeval jungles to watch dinosaurs and then returns to find his own time altered almost beyond recognition. Politics, architecture, language—even the plants and animals seem wrong, distorted. Only when he removes his time-travel suit does he understand what has happened. On the heel of his boot, carried away from the past and therefore unable to perform its function in the evolution of the world, is crushed the wing of a butterfly. Every programmer knows the sensation: A trivial, minute change wreaks havoc in a massive system. Until we know more about programming, we had better for all practical purposes think of systems as composed, not of sturdy structures like algorithms and smaller programs, but of butterflies’ wings.
编程的不连续性为验证敲响了丧钟。如果一个足够狂热的研究人员能够保证软件保持稳定,他可能愿意花两三年的时间来验证一个重要的软件。但现实生活中的程序需要维护和修改。没有理由相信验证修改后的程序比第一次验证原始程序更容易。没有理由相信一个大验证可以是许多小验证的总和。没有理由相信验证可以转移到任何其他程序,甚至不能转移到与原始程序只有一行不同的程序。
The discontinuous nature of programming sounds the death knell for verification. A sufficiently fanatical researcher might be willing to devote two or three years to verifying a significant piece of software if he could be assured that the software would remain stable. But real-life programs need to be maintained and modified. There is no reason to believe that verifying a modified program is any easier than verifying the original the first time around. There is no reason to believe that a big verification can be the sum of many small verifications. There is no reason to believe that a verification can transfer to any other program—not even to a program only one single line different from the original.
正是这种不连续性消除了通过改进数学证明的各种社会过程来改进验证的可能性。孤独的狂热者可能会构建自己的验证,但他永远没有任何理由阅读其他人的验证,也没有人愿意阅读他的验证。任何社区都无法发展。即使是最热心的验证者,只有当他认为他可能能够使用、借用或刷掉验证中的某些东西时,才会被诱导阅读验证。一旦他明白了任何验证与任何其他验证都没有任何必然联系,那么没有什么可以强迫他去阅读别人的验证。
And it is this discontinuity that obviates the possibility of refining verifications by the sorts of social processes that refine mathematical proofs. The lone fanatic might construct his own verification, but he would never have any reason to read anyone else’s, nor would anyone else ever be willing to read his. No community could develop. Even the most zealous verifier could be induced to read a verification only if he thought he might be able to use or borrow or swipe something from it. Nothing could force him to read someone else’s verification once he had grasped the point that no verification bears any necessary connection to any other verification.
程序本身是程序将做什么的唯一完整描述。——PJ·戴维斯
The program itself is the only complete description of what the program will do. —P. J. Davis
由于计算机可以书写符号并以可忽略不计的能量消耗来移动它们,人们很容易贸然得出结论:在符号领域中一切皆有可能。但现实并不那么容易屈服;物理学不会突然失效。不使用资源就不可能构建符号结构,正如不使用资源就不可能构建物质结构一样。即使是最平凡的数学理论,也有一些简单的陈述,其形式证明长得不可能写出来。艾伯特·迈耶(Albert Meyer)关于此类研究历史的杰出演讲,最后对推导出即便相当简单的数学陈述可能有多么困难给出了惊人的诠释。假设我们将逻辑公式编码为二进制字符串,并着手建造一台计算机,用来判定一组长度(比如说)不超过一千位的简单公式的真假。假设我们甚至允许自己奢侈地拥有这样一种技术:它能生产质子大小的电子元件,并用无限细的导线连接。即便如此,我们设计的计算机也必须密集地填满整个可观测宇宙。这种对形式推导长度的精确观察,与我们对普通日常数学证明中所含细节数量的直觉一致。我们经常使用“不失一般性,让我们假设……”或“因此,如有必要,通过重新编号……”来取代大量的形式细节。坚持写出形式细节将是对资源的愚蠢浪费。符号结构和物质结构都必须以非常谨慎的眼光来设计。资源有限;时间有限;能量有限。即使计算机也无法改变宇宙的有限性。
Since computers can write symbols and move them about with negligible expenditure of energy, it is tempting to leap to the conclusion that anything is possible in the symbolic realm. But reality does not yield so easily; physics does not suddenly break down. It is no more possible to construct symbolic structures without using resources than it is to construct material structures without using them. For even the most trivial mathematical theories, there are simple statements whose formal demonstrations would be impossibly long. Albert Meyer’s outstanding lecture on the history of such research concludes with a striking interpretation of how hard it may be to deduce even fairly simple mathematical statements. Suppose that we encode logical formulas as binary strings and set out to build a computer that will decide the truth of a simple set of formulas of length, say, at most a thousand bits. Suppose that we even allow ourselves the luxury of a technology that will produce proton-size electronic components connected by infinitely thin wires. Even so, the computer we design must densely fill the entire observable universe. This precise observation about the length of formal deductions agrees with our intuition about the amount of detail embedded in ordinary, workaday mathematical proofs. We often use “Let us assume, without loss of generality …” or “Therefore, by renumbering, if necessary …” to replace enormous amounts of formal detail. To insist on the formal detail would be a silly waste of resources. Both symbolic and material structures must be engineered with a very cautious eye. Resources are limited; time is limited; energy is limited. Not even the computer can change the finite nature of the universe.
我们假设这些限制阻止了验证的拥护者提供可能相当令人信服的证据来支持他们的方法。时至今日,连一个对实际工作系统的验证都没有,这有时被归咎于该领域的年轻。例如,验证者认为他们现在才开始理解循环不变量。乍一看,这听起来像是扩大规模论证的另一种变体。但事实上,现实生活中有一大类系统几乎没有循环,循环很少出现在商业编程应用程序中。然而,从未有人验证过哪怕一个(比如说)打印真实支票的 COBOL 系统;连一个都没有,让人怀疑未来是否会出现很多。对于验证者来说,资源、时间和精力与我们其他人一样有限。
We assume that these constraints have prevented the adherents of verification from offering what might be fairly convincing evidence in support of their methods. The lack at this late date of even a single verification of a working system has sometimes been attributed to the youth of the field. The verifiers argue, for instance, that they are only now beginning to understand loop invariants. At first blush, this sounds like another variant of the scaling-up argument. But in fact there are large classes of real-life systems with virtually no loops—they scarcely ever occur in commercial programming applications. And yet there has never been a verification of, say, a COBOL system that prints real checks; lacking even one makes it seem doubtful that there could at some time in the future be many. Resources, and time, and energy are just as limited for verifiers as they are for all the rest of us.
因此,我们必须解决困扰了许多代工程师的两个问题:首先,人们必须投入到他们不理解的活动中。其次,人们无法创造出完美的机制。
We must therefore come to grips with two problems that have occupied engineers for many generations: First, people must plunge into activities that they do not understand. Second, people cannot create perfect mechanisms.
那么工程师如何设法创建可靠的结构呢?首先,他们使用与数学社会过程非常相似的社会过程来实现理解的逐次逼近。其次,他们对“可靠”的含义有成熟而现实的看法;特别是,它从来不意味着“完美”。没有办法从逻辑上推断出桥梁是否存在、飞机是否在飞行、或者发电站是否在输送电力。确实,如果工程师在建造它们之前首先展示其完美性,那么桥梁就不会倒塌,飞机就不会坠毁,电力系统就不会停电——这是真的,因为它们根本就不会被建造出来。
How then do engineers manage to create reliable structures? First, they use social processes very like the social processes of mathematics to achieve successive approximations at understanding. Second, they have a mature and realistic view of what “reliable” means; in particular, the one thing it never means is “perfect.” There is no way to deduce logically that bridges stand, or that airplanes fly, or that power stations deliver electricity. True, no bridges would fall, no airplanes would crash, no electrical systems black out if engineers would first demonstrate their perfection before building them—true because they would never be built at all.
编程中的类比是任何正常运行的、有用的、现实世界的系统。以称为 SYNCHEM 的有机化学合成器(Gelernter 等人,1973)为例。对于这个程序,可靠性的标准特别简单:如果它合成了一种化学物质,它就有效;如果不这样做,它就不起作用。再多的正确性也无法指望在这个标准上有所提高;事实上,人们根本不清楚如何开始以一种有助于验证的方式将这一标准形式化。但尝试增加该程序可以合成的化学品数量是一项有用且持续的事业。
The analogy in programming is any functioning, useful, real-world system. Take for instance an organic-chemical synthesizer called SYNCHEM (Gelernter et al., 1973). For this program, the criterion of reliability is particularly straightforward—if it synthesizes a chemical, it works; if it doesn’t, it doesn’t work. No amount of correctness could ever hope to improve on this standard; indeed, it is not at all clear how one could even begin to formalize such a standard in a way that would lend itself to verification. But it is a useful and continuing enterprise to try to increase the number of chemicals the program can synthesize.
正是符号沙文主义让计算机科学家认为我们的结构比物质结构重要得多,以至于(a)它们应该是完美的,(b)应该消耗使它们完美所需的能量。相反,我们认为 (a) 他们不能是完美的,并且 (b) 不应将精力浪费在徒劳地试图使它们完美上。数学真理的概率观点与可靠性的工程概念密切相关,这并非偶然。也许我们应该明确区分程序可靠性和程序完美性,并将我们的精力集中在可靠性上。
It is nothing but symbol chauvinism that makes computer scientists think that our structures are so much more important than material structures that (a) they should be perfect, and (b) the energy necessary to make them perfect should be expended. We argue rather that (a) they cannot be perfect, and (b) energy should not be wasted in the futile attempt to make them perfect. It is no accident that the probabilistic view of mathematical truth is closely allied to the engineering notion of reliability. Perhaps we should make a sharp distinction between program reliability and program perfection—and concentrate our efforts on reliability.
使程序正确的愿望是有建设性的、有价值的。但是,单一的验证观点忽视了接受正确性标准(例如真实数学证明的正确性标准)或可靠性标准(例如真实工程结构的标准)可能带来的好处。对经济限制内的可操作性的追求、通过重复利用成功设计来引导创新的意愿、对同行社区运作的信任——所有使工程和数学真正发挥作用的机制都在对完美可验证性的徒劳探索中被掩盖了。
The desire to make programs correct is constructive and valuable. But the monolithic view of verification is blind to the benefits that could result from accepting a standard of correctness like the standard of correctness for real mathematical proofs, or a standard of reliability like the standard for real engineering structures. The quest for workability within economic limits, the willingness to channel innovation by recycling successful design, the trust in the functioning of a community of peers—all the mechanisms that make engineering and mathematics really work are obscured in the fruitless search for perfect verifiability.
哪些元素可以使编程更像工程和数学?可以利用的一种机制是创建通用结构,随着通用结构可靠性的增加,其特定实例变得更加可靠。这个概念已经出现了好几个版本,其中 Knuth 坚持创建和理解普遍有用的算法是最重要和令人鼓舞的之一。Baker 的团队编程方法(Baker,1972)是一种将软件暴露给社会过程的明确尝试。如果可重用性成为有效设计的标准,越来越广泛的社区将研究最常见的编程工具。
What elements could contribute to making programming more like engineering and mathematics? One mechanism that can be exploited is the creation of general structures whose specific instances become more reliable as the reliability of the general structure increases. This notion has appeared in several incarnations, of which Knuth’s insistence on creating and understanding generally useful algorithms is one of the most important and encouraging. Baker’s team-programming methodology (Baker, 1972) is an explicit attempt to expose software to social processes. If reusability becomes a criterion for effective design, a wider and wider community will examine the most common programming tools.
可验证软件的概念已经存在太久了,不能轻易被取代。然而,对于编程实践来说,可验证性决不能掩盖可靠性。科学家不应该将数学模型与现实混淆——验证只不过是可信度的模型。可验证性不是也不可能成为软件设计中的主要关注点。经济学、最后期限、成本效益比、个人和团队风格、可接受错误的限制——所有这些在设计中比可验证性或不可验证性具有更大的重要性。
The concept of verifiable software has been with us too long to be easily displaced. For the practice of programming, however, verifiability must not be allowed to overshadow reliability. Scientists should not confuse mathematical models with reality—and verification is nothing but a model of believability. Verifiability is not and cannot be a dominating concern in software design. Economics, deadlines, cost-benefit ratios, personal and group style, the limits of acceptable error—all these carry immensely much more weight in design than verifiability or nonverifiability.
到目前为止,还很少有关于使软件变得可靠而不是可验证的哲学讨论。如果验证拥护者能够重新定义他们的努力并重新定位自己以实现这一目标,或者如果可以出现另一种软件观点,利用数学的社会过程和工程学的适度期望,那么现实生活中的编程和理论计算机科学的兴趣可能会增加。两者都得到更好的服务。
So far, there has been little philosophical discussion of making software reliable rather than verifiable. If verification adherents could redefine their efforts and reorient themselves to this goal, or if another view of software could arise that would draw on the social processes of mathematics and the modest expectations of engineering, the interests of real-life programming and theoretical computer science might both be better served.
即使由于某种我们现在无法理解的原因,事实证明我们完全错误而验证者完全正确,现在也不是限制编程研究的时候。我们现在所知甚少,无法判断哪些方向将最富有成效。如果我们的推理无法说服任何人,如果验证似乎仍然是一条值得探索的途径,那就这样吧;我们三人只能试图论证反对验证,而不能把它从地球表面抹去。但我们恳请我们的朋友和同事不要将视野局限于这一种观点,无论它看起来多么有希望。不要让它成为唯一的观点、唯一的途径。雅各布·布罗诺夫斯基(Jacob Bronowski)对另一学科历史上的一个时期有一个重要的见解,这个时期可能与计算发展中我们所处的时期相似:“一门过早整理思想的科学会被扼杀……。中世纪炼金术士希望改变元素的愿望并不像我们曾经认为的那样异想天开。但对于一门尚不了解水和食盐成分的化学来说,它只会造成损害。”
Even if, for some reason that we are not now able to understand, we should be proved wholly wrong and the verifiers wholly right, this is not the moment to restrict research on programming. We know too little now to sense what directions will be most fruitful. If our reasoning convinces no one, if verification still seems an avenue worth exploring, so be it; we three can only try to argue against verification, not blast it off the face of the earth. But we implore our friends and colleagues not to narrow their vision to this one view no matter how promising it may seem. Let it not be the only view, the only avenue. Jacob Bronowski has an important insight about a time in the history of another discipline that may be similar to our own time in the development of computing: “A science which orders its thought too early is stifled …. The hope of the medieval alchemists that the elements might be changed was not as fanciful as we once thought. But it was merely damaging to a chemistry which did not yet understand the composition of water and common salt.”
转载自 DeMillo 等人。(1977,1979),经计算机协会许可。
Reprinted from DeMillo et al. (1977, 1979), with permission from the Association for Computing Machinery.
Diffie 和 Hellman(1976a,此处为第 42 章)的副本很快就传到了麻省理工学院,在那里 Ronald Rivest(生于 1947 年)、Leonard Adleman(生于 1945 年)和 Adi Shamir(生于 1952 年)着手研究如何找到一种同时适用于密钥分发和数字签名问题的公钥密码系统。最终,三人想出了如何使用因式分解问题作为这样一个系统的基础。20 世纪 90 年代末,当万维网开始商业化使用时,RSA 算法开始广泛部署在互联网安全软件中。如今,每笔银行交易、电子商务购买和网约车呼叫都涉及用户浏览器或应用程序与提供服务的计算机之间悄然完成的密钥交换。
A copy of Diffie and Hellman (1976a, here chapter 42) quickly reached MIT, where Ronald Rivest (b. 1947), Leonard Adleman (b. 1945), and Adi Shamir (b. 1952) set to work on the problem of finding a public-key cryptosystem suitable for both the key distribution and digital signature problems. Eventually the three figured out how to use the factoring problem as the basis for such a system. The RSA algorithm became widely deployed in internet security software in the late 1990s when the World Wide Web began to be used commercially. Today, every banking transaction, e-commerce purchase, and rideshare summons involves an unspoken key exchange worked out between the user’s browser or app and the computers offering the service.
这篇论文非凡地证明了算法之美,也证明了即使是最纯粹的数学也有实用价值。在几位作者年轻的时候,任何立志成为计算机科学家的人都不可能因为预期数论对构建计算机系统很重要而决定去研究它,但这三个人对这门学科足够了解,能够集思广益,互相质疑并纠正彼此关于如何运用它的想法。只要您了解一点 18 世纪关于整数性质的数学知识,就会发现该算法非常优雅且易于描述。然而,它依赖于一个未经证明的前提:不存在快速分解大数的算法。
The paper is a remarkable testament to the beauty of algorithms and to the utility of even the purest mathematics. No one preparing to be a computer scientist when the authors were young could have decided to study number theory in the expectation that it would be important in building computer systems, but these three knew the subject well enough to brainstorm, challenge, and correct each other’s ideas of how to use it. The algorithm is elegant and easy to describe—if you know a bit of eighteenth-century mathematics about the properties of integers. Yet it depends on an unproven premise: that there is no fast algorithm for finding factors of large numbers.
因式分解的明显困难是众所周知的。1801 年,卡尔·弗里德里希·高斯 (Carl Friedrich Gauss) 观察到,“古代和现代”数学家设计的最佳方法“甚至考验熟练计算器的耐心”,并敦促“探索一切可能的方法来解决如此优雅和如此著名的问题” ”(Gauss,1986,第 396f 页)——从而预示了本文倒数第二段所敦促的议程。逻辑钢琴的发明者威廉·杰文斯(本书第 27 页)指出,因式分解似乎只是我们现在所说的单向函数之一。“在很多情况下,我们可以轻松且无误地做某件事,但要撤销它可能会遇到很多麻烦。……给定任意两个数字,我们可以通过一个简单且无误的过程获得他们的产品;但是当给定一个大数字时,确定其因素就完全是另一回事了”(Jevons,1874,第 122 页)。尽管自广泛采用 RSA 算法以来付出了巨大的努力,但分解成本呈指数级增长的说法经常被断言(包括杰文斯),但从未得到证实。
The apparent difficulty of factoring was already well known. In 1801 Carl Friedrich Gauss observed that the best methods devised by both “ancient and modern” mathematicians “try the patience of even the practiced calculator,” and urged that “every possible means be explored for the solution of a problem so elegant and so celebrated” (Gauss, 1986, pages 396f.)—thus foreshadowing the agenda urged in the penultimate paragraph of this paper. William Jevons, inventor of the logic piano (page 27 of this volume), noted that factorization seemed to be only one of what we would now call one-way functions. “There are many cases in which we can easily and infallibly do a certain thing but may have much trouble in undoing it. … Given any two numbers, we may by a simple and infallible process obtain their product; but when a large number is given it is quite another matter to determine its factors” (Jevons, 1874, page 122). That factoring is exponentially costly has often been asserted (including by Jevons) but never proved, in spite of strenuous efforts since the widespread adoption of the RSA algorithm.
快速分解会破坏 RSA 密码系统,而据我们所知,该系统至今仍未被攻破;某些政府机构或犯罪分子已经开发出分解大数的秘密技术,这种可能性始终存在。但随着技术的进步,本文建议的密钥长度在实际应用中已经加长。虽然基于其他数学对象(例如椭圆曲线)的公钥系统已经开发出来,但它们同样尚未被证明是安全的。
Fast factoring would break the RSA cryptosystem, and the system remains unbroken, as far as we know; there is always the possibility that some government agency or criminal has developed a secret technique for factoring large numbers. But as technology has advanced, the key lengths suggested in this paper have been increased in actual practice. And while public-key systems based on other mathematical objects (for example, elliptic curves) have been developed, they too suffer from not, as yet, having been proved secure.
尽管作者建议熟悉 Diffie 和 Hellman(1976a)的读者可以跳过本文的前几节,但我们将它们包括在内,因为它们直接地列出了上下文,并且值得注意的是,介绍了“Alice”和“Bob”,他们在随后的许多文献都扮演了沟通双方的角色。
Though the authors suggest that readers familiar with Diffie and Hellman (1976a) can skip the first few sections of this paper, we include them because they straightforwardly lay out the context—and, notably, introduce “Alice” and “Bob,” who in much of the subsequent literature play the roles of the communicating parties.
里维斯特仍然在麻省理工学院任教,阿德曼在南加州大学任教,沙米尔在以色列魏茨曼研究所任教。这三者都对计算机科学的其他领域做出了重大贡献(特别参见第 46 章)。1983 年,他们成立了一家公司,将本文中提出的发现商业化(RSA Security,被 Security Dynamics 收购,Security Dynamics 又被 EMC 收购,然后又被 Dell 收购)。2002年,他们共同获得了图灵奖。
Rivest remains on the faculty at MIT, while Adleman is at the University of Southern California and Shamir is at the Weizmann Institute in Israel. All three have contributed significantly to other areas of computer science (see chapter 46 in particular). In 1983 they founded a company to commercialize the discovery presented in this paper (RSA Security, which was acquired by Security Dynamics, which in turn was acquired by EMC, which was then acquired by Dell). In 2002 they were jointly recognized with the Turing Award.
本文提出一种加密方法,它具有一个新颖的特性:公开加密密钥并不会因此泄露相应的解密密钥。这有两个重要的后果。
An encryption method is presented with the novel property that publicly revealing an encryption key does not thereby reveal the corresponding decryption key. This has two important consequences.
1. 不需要快递或其他安全手段来传输密钥,因为可以使用预期接收者公开透露的加密密钥对消息进行加密。只有他才能解密该消息,因为只有他知道相应的解密密钥。
1. Couriers or other secure means are not needed to transmit keys, since a message can be enciphered using an encryption key publicly revealed by the intended recipient. Only he can decipher the message, since only he knows the corresponding decryption key.
2. 可以使用私有的解密密钥对消息进行“签名”。任何人都可以使用相应的公开加密密钥来验证此签名。签名无法伪造,签名者事后也无法否认其签名的有效性。这在“电子邮件”和“电子资金转账”系统中有明显的应用。
2. A message can be “signed” using a privately held decryption key. Anyone can verify this signature using the corresponding publicly revealed encryption key. Signatures cannot be forged, and a signer cannot later deny the validity of his signature. This has obvious applications in “electronic mail” and “electronic funds transfer” systems.
消息的加密方法是:将其表示为数字 M,计算 M 的公开指定的 e 次幂,然后取该结果除以公开指定的乘积 n 所得的余数,其中 n 是两个大的秘密素数 p 和 q 的乘积。解密与此类似,只是使用另一个秘密的幂 d,其中 e · d ≡ 1 (mod (p − 1) · (q − 1))。系统的安全性部分取决于对已公布的除数 n 进行因式分解的难度。
A message is encrypted by representing it as a number M, raising M to a publicly specified power e, and then taking the remainder when the result is divided by the publicly specified product, n, of two large secret prime numbers p and q. Decryption is similar; only a different, secret, power d is used, where e · d ≡ 1 (mod (p − 1) · (q − 1)). The security of the system rests in part on the difficulty of factoring the published divisor, n.
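The arithmetic just described can be tried out with deliberately tiny numbers. The following Python sketch is illustrative only: the primes 61 and 53, the exponent 17, and the message 65 are assumptions chosen for readability and do not come from the paper, and the function names are hypothetical. Real keys use primes hundreds of digits long.

```python
# Toy parameters (assumed for illustration; far too small to be secure).
p, q = 61, 53                 # the two secret primes
n = p * q                     # public modulus: 3233
phi = (p - 1) * (q - 1)       # (p - 1) * (q - 1) = 3120
e = 17                        # public power, chosen coprime to phi
d = pow(e, -1, phi)           # secret power with e * d ≡ 1 (mod phi); needs Python 3.8+


def encrypt(m: int) -> int:
    """Raise m to the public power e and keep the remainder mod n."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """Raise c to the secret power d and keep the remainder mod n."""
    return pow(c, d, n)


message = 65
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)

# "Signing" applies the secret procedure first; anyone holding (e, n) can check it.
signature = decrypt(message)
signature_ok = encrypt(signature) == message
```

With these values `d` comes out as 2753, the ciphertext as 2790, and decryption returns the original 65. Nothing here models padding or message encoding, which any practical use would need.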
“电子邮件”时代(Potter,1977)可能很快就会到来;我们必须确保保留当前“纸质邮件”系统的两个重要属性:(a)消息是私密的,以及(b)消息可以签名。我们在本文中演示了如何将这些功能构建到电子邮件系统中。
The era of “electronic mail” (Potter, 1977) may soon be upon us; we must ensure that two important properties of the current “paper mail” system are preserved: (a) messages are private, and (b) messages can be signed. We demonstrate in this paper how to build these capabilities into an electronic mail system.
我们提案的核心是一种新的加密方法。该方法提供了“公钥密码系统”的实现,这是 Diffie 和 Hellman 发明的一个优雅的概念(1976a,此处第 42 章)。他们的文章激发了我们的研究,因为他们提出了这样一个系统的概念,但没有实际实现。熟悉 Diffie 和 Hellman (1976a) 的读者可能希望直接跳到第45.5节来了解我们方法的描述。
At the heart of our proposal is a new encryption method. This method provides an implementation of a “public-key cryptosystem,” an elegant concept invented by Diffie and Hellman (1976a, here chapter 42). Their article motivated our research, since they presented the concept but not any practical implementation of such a system. Readers familiar with Diffie and Hellman (1976a) may wish to skip directly to §45.5 for a description of our method.
在“公钥密码系统”中,每个用户将加密过程E放入公共文件中。也就是说,公共文件是给出每个用户的加密过程的目录。用户对其相应的解密过程D的细节保密。这些过程具有以下四个属性:
In a “public-key cryptosystem” each user places in a public file an encryption procedure E. That is, the public file is a directory giving the encryption procedure of each user. The user keeps secret the details of his corresponding decryption procedure D. These procedures have the following four properties:
(a) 对消息 M 的加密形式进行解密,得到 M。形式上,D(E(M)) = M。
(a) Deciphering the enciphered form of a message M yields M. Formally, D(E(M)) = M.
(b) E和D都很容易计算。
(b) Both E and D are easy to compute.
(c) 通过公开揭示E,用户并未揭示计算D的简单方法。这意味着实际上只有他才能解密用E加密的消息,或者有效地计算D。
(c) By publicly revealing E the user does not reveal an easy way to compute D. This means that in practice only he can decrypt messages encrypted with E, or compute D efficiently.
(d) 如果先对消息 M 进行解密,然后再对其进行加密,则结果为 M。形式上,E(D(M)) = M。
(d) If a message M is first deciphered and then enciphered, M is the result. Formally, E(D(M)) = M.
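Properties (a) and (d) can be checked exhaustively for a toy instance. The sketch below borrows the power-and-remainder scheme from the abstract as a stand-in for E and D; the tiny primes and the exponent are assumptions for illustration, not values from the paper. Properties (b) and (c) concern computational cost and cannot be shown by a check of this kind.

```python
# Toy key (assumed values, far too small for real use).
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)           # modular inverse; needs Python 3.8+


def E(m: int) -> int:
    """Public enciphering procedure of the toy system."""
    return pow(m, e, n)


def D(c: int) -> int:
    """Secret deciphering procedure of the toy system."""
    return pow(c, d, n)


# Property (a): deciphering an enciphered message gives the message back.
property_a = all(D(E(m)) == m for m in range(n))

# Property (d): enciphering a deciphered message also gives the message back.
property_d = all(E(D(m)) == m for m in range(n))
```

Both flags come out true for every one of the 3233 possible messages, which is feasible to confirm only because the message space is so small.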
加密(或解密)过程通常由通用方法和加密密钥组成。一般方法是在密钥的控制下对消息M进行加密,得到消息的加密形式,称为密文C。每个人都可以使用相同的通用方法;给定程序的安全性取决于密钥的安全性。公开加密算法就意味着公开密钥。
An encryption (or decryption) procedure typically consists of a general method and an encryption key. The general method, under control of the key, enciphers a message M to obtain the enciphered form of the message, called the ciphertext C. Everyone can use the same general method; the security of a given procedure will rest on the security of the key. Revealing an encryption algorithm then means revealing the key.
当用户揭示E时,他揭示了一种非常低效的计算D ( C )的方法:测试所有可能的消息M,直到找到E ( M ) = C的消息。如果满足属性 (c),则要测试的此类消息的数量将非常大,以致于这种方法不切实际。
When the user reveals E he reveals a very inefficient method of computing D(C): testing all possible messages M until one such that E(M) = C is found. If property (c) is satisfied the number of such messages to test will be so large that this approach is impractical.
满足(a)-(c)的函数E是“活板门单向函数”;如果它也满足 (d),则它是“活板门单向排列”。Diffie 和 Hellman(1976a,此处第 42 章)介绍了陷门单向函数的概念,但没有提供任何示例。这些函数被称为“单向”,因为它们在一个方向上很容易计算,但在另一个方向上(显然)很难计算。它们被称为“活板门”函数,因为一旦知道某些私有“活板门”信息,反函数实际上很容易计算。也满足(d)的陷门单向函数必须是一种排列:每个消息都是其他消息的密文,并且每个密文本身都是允许的消息。(映射是“一对一”和“到”。)仅在实现“签名”时才需要属性 (d)。
A function E satisfying (a)–(c) is a “trap-door one-way function”; if it also satisfies (d) it is a “trap-door one-way permutation.” Diffie and Hellman (1976a, here chapter 42) introduced the concept of trap-door one-way functions but did not present any examples. These functions are called “one-way” because they are easy to compute in one direction but (apparently) very difficult to compute in the other direction. They are called “trap-door” functions since the inverse functions are in fact easy to compute once certain private “trap-door” information is known. A trap-door one-way function which also satisfies (d) must be a permutation: every message is the ciphertext for some other message and every ciphertext is itself a permissible message. (The mapping is “one-to-one” and “onto.”) Property (d) is needed only to implement “signatures.”
我们鼓励读者阅读 Diffie 和 Hellman 的优秀文章,以了解更多背景知识、对公钥密码系统概念的详细阐述,以及对密码学领域其他问题的讨论。公钥密码系统确保隐私和实现“签名”的方法(在下面第 45.3 和 45.4 节中描述)也归功于 Diffie 和 Hellman。
The reader is encouraged to read Diffie and Hellman’s excellent article for further background, for elaboration of the concept of a public-key cryptosystem, and for a discussion of other problems in the area of cryptography. The ways in which a public-key cryptosystem can ensure privacy and enable “signatures” (described in §§45.3 and 45.4 below) are also due to Diffie and Hellman.
对于我们的场景,我们假设 A 和 B(也称为 Alice 和 Bob)是公钥密码系统的两个用户。我们用下标来区分它们的加密和解密过程:E A , D A , E B , D B。
For our scenarios we suppose that A and B (also known as Alice and Bob) are two users of a public-key cryptosystem. We will distinguish their encryption and decryption procedures with subscripts: EA, DA, EB, DB.
加密是使通信保密的标准方法。发送者在将每条消息传输到接收者之前对其进行加密。接收者(但不是未经授权的人)知道适用于接收到的消息的适当解密函数以获得原始消息。听到传输消息的窃听者只听到“垃圾”(密文),这对他来说毫无意义,因为他不知道如何解密。
Encryption is the standard means of rendering a communication private. The sender enciphers each message before transmitting it to the receiver. The receiver (but no unauthorized person) knows the appropriate deciphering function to apply to the received message to obtain the original message. An eavesdropper who hears the transmitted message hears only “garbage” (the ciphertext) which makes no sense to him since he does not know how to decrypt it.
目前计算机化数据库中保存着大量个人和敏感信息,并通过电话线传输,这使得加密变得越来越重要。认识到高效、高质量的加密技术非常需要但又供不应求这一事实,美国国家标准局 (NBS) 最近采用了 IBM 开发的“数据加密标准”(《联邦公报》:第 40 卷第 42 期,1975 年 3 月 17 日;第 40 卷第 149 期,1975 年 8 月 1 日)。新标准不具备实现公钥密码系统所需的属性 (c)。
The large volume of personal and sensitive information currently held in computerized data banks and transmitted over telephone lines makes encryption increasingly important. In recognition of the fact that efficient, high-quality encryption techniques are very much needed but are in short supply, the National Bureau of Standards has recently adopted a “Data Encryption Standard,” developed at IBM (Federal Register: Vol. 40, No. 42, March 17, 1975; Vol. 40, No. 149, August 1, 1975). The new standard does not have property (c), needed to implement a public-key cryptosystem.
所有经典加密方法(包括 NBS 标准)都遭受“密钥分配问题”的困扰。问题是,在私人通信开始之前,需要另一笔私人交易来分别向发送者和接收者分发相应的加密和解密密钥。通常,私人信使用于将密钥从发送者运送到接收者。如果电子邮件系统要快速且便宜,则这种做法是不可行的。公钥密码系统不需要私人信使;密钥可以通过不安全的通信通道分发。
All classical encryption methods (including the NBS standard) suffer from the “key distribution problem.” The problem is that before a private communication can begin, another private transaction is necessary to distribute corresponding encryption and decryption keys to the sender and receiver, respectively. Typically a private courier is used to carry a key from the sender to the receiver. Such a practice is not feasible if an electronic mail system is to be rapid and inexpensive. A public-key cryptosystem needs no private couriers; the keys can be distributed over the insecure communications channel.
在公钥密码系统中,Bob 如何向 Alice 发送私人消息M ?首先,他从公共文件中检索E A。然后他向她发送加密消息E A ( M )。Alice 通过计算D A ( E A ( M )) = M来解密消息。根据公钥密码系统的属性 (c),只有她可以破译E A ( M )。她可以使用E B加密私人回复,该回复也可以在公共文件中找到。
How can Bob send a private message M to Alice in a public-key cryptosystem? First, he retrieves EA from the public file. Then he sends her the enciphered message EA(M). Alice deciphers the message by computing DA(EA(M)) = M. By property (c) of the public-key cryptosystem only she can decipher EA(M). She can encipher a private response with EB, also available in the public file.
请注意,Alice 和 Bob 之间不需要进行任何私人交易即可建立私人通信。唯一需要的“设置”是每个希望接收私人通信的用户必须将其加密算法放入公共文件中。
Observe that no private transactions between Alice and Bob are needed to establish private communication. The only “setup” required is that each user who wishes to receive private communications must place his enciphering algorithm in the public file.
两个用户还可以通过不安全的通信通道建立私人通信,而无需查阅公共文件。每个用户将他的加密密钥发送给另一个用户。之后,所有消息都使用接收者的加密密钥进行加密,就像在公钥系统中一样。监听通道的入侵者无法破译任何消息,因为不可能从加密密钥导出解密密钥。(我们假设入侵者无法修改或插入消息到通道中。)Ralph Merkle (1978) 针对这个问题开发了另一种解决方案。
Two users can also establish private communication over an insecure communications channel without consulting a public file. Each user sends his encryption key to the other. Afterwards all messages are enciphered with the encryption key of the recipient, as in the public-key system. An intruder listening in on the channel cannot decipher any messages, since it is not possible to derive the decryption keys from the encryption keys. (We assume that the intruder cannot modify or insert messages into the channel.) Ralph Merkle (1978) has developed another solution to this problem.
公钥密码系统可用于“引导”标准加密方案,例如 NBS 方法。一旦建立了安全通信,传输的第一条消息就可以成为 NBS 方案中使用的密钥,以对所有后续消息进行编码。如果我们的方法的加密速度比标准方案慢,那么这可能是可取的。(如果使用专用硬件加密设备,NBS 方案可能会更快一些;我们的方案在通用计算机上可能会更快,因为多精度算术运算比复杂的位操作更容易实现。)
A public-key cryptosystem can be used to “bootstrap” into a standard encryption scheme such as the NBS method. Once secure communications have been established, the first message transmitted can be a key to use in the NBS scheme to encode all following messages. This may be desirable if encryption with our method is slower than with the standard scheme. (The NBS scheme is probably somewhat faster if special-purpose hardware encryption devices are used; our scheme may be faster on a general-purpose computer since multiprecision arithmetic operations are simpler to implement than complicated bit manipulations.)
如果电子邮件系统要取代现有的纸质邮件系统进行商业交易,则必须能够“签署”电子消息。签名消息的收件人有证据证明该消息源自发件人。这种质量比单纯的身份验证(接收者可以验证消息是否来自发送者)更强;接收者可以让“法官”相信签名者发送了消息。为此,他必须让法官相信他本人没有伪造签名信息!在身份验证问题中,接收者并不担心这种可能性,因为他只想确保消息来自发送者。
If electronic mail systems are to replace the existing paper mail system for business transactions, “signing” an electronic message must be possible. The recipient of a signed message has proof that the message originated from the sender. This quality is stronger than mere authentication (where the recipient can verify that the message came from the sender); the recipient can convince a “judge” that the signer sent the message. To do so, he must convince the judge that he did not forge the signed message himself! In an authentication problem the recipient does not worry about this possibility, since he only wants to satisfy himself that the message came from the sender.
电子签名必须依赖于消息以及依赖于签名者。否则,接收者可以在向法官展示消息签名对之前修改消息。或者他可以将签名附加到任何消息上,因为不可能检测到电子“剪切和粘贴”。
An electronic signature must be message-dependent, as well as signer-dependent. Otherwise the recipient could modify the message before showing the message-signature pair to a judge. Or he could attach the signature to any message whatsoever, since it is impossible to detect electronic “cutting and pasting.”
为了实现签名,公钥密码系统必须使用陷门单向排列(即具有属性(d))来实现,因为解密算法将应用于未加密的消息。
To implement signatures the public-key cryptosystem must be implemented with trap-door one-way permutations (i.e. have property (d)), since the decryption algorithm will be applied to unenciphered messages.
用户 Bob 如何在公钥密码系统中向 Alice 发送“签名”消息 M?他首先使用DB计算消息M的“签名” S: S = DB ( M )。(根据公钥密码系统的属性 (d),解密未加密的消息“有意义”:每条消息都是其他消息的密文。)然后,他使用E A加密S(出于隐私考虑),并发送结果E A ( S ) 给爱丽丝。他也不必发送M ;它可以从S计算出来。
How can user Bob send Alice a “signed” message M in a public-key cryptosystem? He first computes his “signature” S for the message M using DB: S = DB(M). (Deciphering an unenciphered message “makes sense” by property (d) of a public-key cryptosystem: each message is the ciphertext for some other message.) He then encrypts S using EA (for privacy), and sends the result EA(S) to Alice. He need not send M as well; it can be computed from S.
Alice首先用D A解密密文得到S。她知道谁是签名的假定发送者(在本例中是鲍勃);如有必要,可以以附加到S 的纯文本形式给出。然后,她使用发送者的加密过程提取消息,在本例中为E B(可在公共文件中找到):M = E B ( S )。她现在拥有一个消息签名对 ( M, S ),其属性类似于已签名的纸质文档的属性。
Alice first decrypts the ciphertext with DA to obtain S. She knows who is the presumed sender of the signature (in this case, Bob); this can be given if necessary in plain text attached to S. She then extracts the message with the encryption procedure of the sender, in this case EB (available on the public file): M = EB(S). She now possesses a message-signature pair (M, S) with properties similar to those of a signed paper document.
鲍勃后来无法否认向爱丽丝发送了这条消息,因为没有其他人可以创建S = D B ( M )。Alice 可以让“法官”相信E B ( S ) = M,因此她有证据证明 Bob 签署了该文件。
Bob cannot later deny having sent Alice this message, since no one else could have created S = DB(M). Alice can convince a “judge” that EB(S) = M, so she has proof that Bob signed the document.
显然 Alice 无法将M修改为不同的版本M ′,因为那样她也必须创建相应的签名S ′ = D B ( M ′)。
Clearly Alice cannot modify M to a different version M′, since then she would have to create the corresponding signature S′ = DB(M′) as well.
因此,爱丽丝收到了鲍勃“签名”的消息,她可以“证明”鲍勃发送了该消息,但她无法修改该消息。(她也不能伪造他的签名来发送任何其他消息。)
Therefore Alice has received a message “signed” by Bob, which she can “prove” that he sent, but which she cannot modify. (Nor can she forge his signature for any other message.)
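The sign-then-encrypt exchange described above can be traced with a toy Python sketch. The primes, exponents, message value, and the helper names `make_key` and `apply_key` are all illustrative assumptions, far too small for real use. Bob's modulus is deliberately chosen smaller than Alice's so that his signature S fits inside her message range, and the public exponent is fixed first purely for brevity.

```python
# Toy sketch only: tiny, hypothetical keys; real keys use primes of ~100 digits.
def make_key(p, q, e):
    """Return (public, private) key pairs (e, n) and (d, n) for primes p, q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse (Python 3.8+)
    return (e, n), (d, n)

def apply_key(x, key):
    """Apply E or D: raise x to the key's exponent modulo the key's n."""
    exponent, n = key
    return pow(x, exponent, n)

bob_pub, bob_priv = make_key(61, 53, 17)      # n_B = 3233
alice_pub, alice_priv = make_key(89, 97, 5)   # n_A = 8633, larger than n_B

M = 1234                                  # message, must be below n_B
S = apply_key(M, bob_priv)                # Bob signs: S = D_B(M)
wire = apply_key(S, alice_pub)            # Bob encrypts for privacy: E_A(S)

S_received = apply_key(wire, alice_priv)  # Alice decrypts: D_A(E_A(S)) = S
M_recovered = apply_key(S_received, bob_pub)  # Alice extracts: M = E_B(S)
```

Alice ends up holding the pair (M_recovered, S_received); anyone with Bob's public key can repeat the last step to check that the two values belong together.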
电子支票系统可以基于诸如上述的签名系统。很容易想象您的家庭终端中的加密设备允许您签署通过电子邮件发送给收款人的支票。只需在每张支票中包含唯一的支票号码,这样即使收款人复制支票,银行也只会兑现其看到的第一个版本。
An electronic checking system could be based on a signature system such as the above. It is easy to imagine an encryption device in your home terminal allowing you to sign checks that get sent by electronic mail to the payee. It would only be necessary to include a unique check number in each check so that even if the payee copies the check the bank will only honor the first version it sees.
如果加密设备能够做得足够快,就会出现另一种可能性:在电话交谈中,所说的每个字在传输之前都会由加密设备签名。
Another possibility arises if encryption devices can be made fast enough: it will be possible to have a telephone conversation in which every word spoken is signed by the encryption device before transmission.
当加密用于如上所述的签名时,重要的是加密设备不要“连接”在终端(或计算机)和通信通道之间,因为消息可能必须用多个密钥连续加密。将加密设备视为可以根据需要执行的“硬件子例程”也许更自然。
When encryption is used for signatures as above, it is important that the encryption device not be “wired in” between the terminal (or computer) and the communications channel, since a message may have to be successively enciphered with several keys. It is perhaps more natural to view the encryption device as a “hardware subroutine” that can be executed as needed.
上面我们假设每个用户总是可以可靠地访问公共文件。在“计算机网络”中,这可能很困难;“入侵者”可能会伪造消息,声称来自公共文件。用户希望确保他确实获得了他想要的通信者的加密过程,而不是入侵者的加密过程。如果公共文件“签署”它发送给用户的每条消息,这种危险就会消失。用户可以使用公共文件的加密算法E PF来检查签名。通过在每个用户第一次(亲自)出现加入公钥密码系统并存放其公共加密过程时向每个用户提供E PF的描述,可以避免在公共文件中“查找” E PF本身的问题。然后他存储了这个描述,而不是再次查找它。因此,当用户加入系统时,每对用户之间对信使的需要已被每个用户和公共文件管理器之间的单个安全会议的要求所取代。另一种解决方案是,当每个用户注册时,为他提供一本书(如电话簿),其中包含系统中用户的所有加密密钥。
We have assumed above that each user can always access the public file reliably. In a “computer network” this might be difficult; an “intruder” might forge messages purporting to be from the public file. The user would like to be sure that he actually obtains the encryption procedure of his desired correspondent and not, say, the encryption procedure of the intruder. This danger disappears if the public file “signs” each message it sends to a user. The user can check the signature with the public file’s encryption algorithm EPF. The problem of “looking up” EPF itself in the public file is avoided by giving each user a description of EPF when he first shows up (in person) to join the public-key cryptosystem and to deposit his public encryption procedure. He then stores this description rather than ever looking it up again. The need for a courier between every pair of users has thus been replaced by the requirement for a single secure meeting between each user and the public-file manager when the user joins the system. Another solution is to give each user, when he signs up, a book (like a telephone directory) containing all the encryption keys of users in the system.
要使用我们的方法使用公共加密密钥 ( e, n )加密消息M,请按以下步骤操作。(这里e和n是一对正整数。)
To encrypt a message M with our method, using a public encryption key (e, n), proceed as follows. (Here e and n are a pair of positive integers.)
首先,将消息表示为 0 到n − 1之间的整数。(将长消息分成一系列块,并将每个块表示为这样的整数。)使用任何标准表示。这里的目的不是加密消息,而只是将其转换为加密所需的数字形式。
First, represent the message as an integer between 0 and n − 1. (Break a long message into a series of blocks, and represent each block as such an integer.) Use any standard representation. The purpose here is not to encrypt the message but only to get it into the numeric form necessary for encryption.
然后,通过将消息取 e 次幂再对 n 取模来加密消息。即,结果(密文 C)为 M^e 除以 n 的余数。
Then, encrypt the message by raising it to the eth power modulo n. That is, the result (the ciphertext C) is the remainder when M^e is divided by n.
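要解密密文,请将其取另一个幂 d,再次对 n 取模。因此,加密和解密算法 E 和 D 为:对于消息 M,C ≡ E(M) ≡ M^e (mod n);对于密文 C,D(C) ≡ C^d (mod n)。
To decrypt the ciphertext, raise it to another power d, again modulo n. The encryption and decryption algorithms E and D are thus: C ≡ E(M) ≡ M^e (mod n), for a message M; D(C) ≡ C^d (mod n), for a ciphertext C.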
请注意,加密不会增加消息的大小;消息和密文都是 0 到 n − 1 范围内的整数。
Note that encryption does not increase the size of a message; both the message and the ciphertext are integers in the range 0 to n − 1.
因此,加密密钥是一对正整数 ( e, n )。类似地,解密密钥是一对正整数(d,n)。每个用户将其加密密钥公开,并将相应的解密密钥保密。(这些整数应该像n A、e A和d A一样正确地加上下标,因为每个用户都有自己的集合。但是,我们将只考虑典型的集合,并将省略下标。)
The encryption key is thus the pair of positive integers (e, n). Similarly, the decryption key is the pair of positive integers (d, n). Each user makes his encryption key public, and keeps the corresponding decryption key private. (These integers should properly be subscripted as in nA, eA, and dA, since each user has his own set. However, we will only consider a typical set, and will omit the subscripts.)
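As a minimal sketch of the two procedures, the following Python uses a toy key with n = 61 · 53 = 3233, e = 17, and d = 2753. These values and the function names are assumptions chosen for illustration; a real modulus has hundreds of digits. Python's built-in three-argument `pow` performs the modular exponentiation.

```python
# Toy key (hypothetical values): e * d = 46801 = 15 * 3120 + 1, where 3120 = 60 * 52.
N = 61 * 53      # public modulus n = 3233
E_EXP = 17       # public exponent e
D_EXP = 2753     # private exponent d

def encrypt(message: int) -> int:
    """E(M): remainder of M^e divided by n. Requires 0 <= message < n."""
    if not 0 <= message < N:
        raise ValueError("message must be an integer in the range 0 to n - 1")
    return pow(message, E_EXP, N)

def decrypt(ciphertext: int) -> int:
    """D(C): remainder of C^d divided by n."""
    return pow(ciphertext, D_EXP, N)

ciphertext = encrypt(65)
plaintext = decrypt(ciphertext)   # recovers the original block
```

Both values stay in the range 0 to n − 1, so the ciphertext is no longer than the message block.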
如果您想使用我们的方法,您应该如何选择加密和解密密钥?
How should you choose your encryption and decryption keys, if you want to use our method?
首先将n计算为两个素数p和q的乘积:n = p · q。这些素数是非常大的“随机”素数。尽管您将公开n ,但由于因式分解n的巨大困难,因子p和q将对其他人有效地隐藏。这也隐藏了从e导出d的方式。
You first compute n as the product of two primes p and q: n = p · q. These primes are very large, “random” primes. Although you will make n public, the factors p and q will be effectively hidden from everyone else due to the enormous difficulty of factoring n. This also hides the way d can be derived from e.
然后,您选择整数d作为一个大的随机整数,它与 ( p − 1) · ( q − 1) 互质。即,检查d是否满足 gcd( d, ( p − 1) · ( q − 1)) = 1 (“gcd”表示“最大公约数”)。
You then pick the integer d to be a large, random integer which is relatively prime to (p− 1) · (q − 1). That is, check that d satisfies gcd(d, (p− 1) · (q − 1)) = 1 (“gcd” means “greatest common divisor”).
整数 e 最终根据 p、q 和 d 计算得出,它是 d 以 (p − 1) · (q − 1) 为模的“乘法逆元”。因此我们有 e · d ≡ 1 (mod (p − 1) · (q − 1))。
The integer e is finally computed from p, q, and d to be the “multiplicative inverse” of d, modulo (p− 1) · (q − 1). Thus we have e·d ≡ 1 (mod (p− 1) · (q − 1)).
我们在下一节中证明这保证了( 45.1 )和( 45.2 )成立,即E和D是逆排列。§ 45.7显示了如何有效地完成上述每个操作。
We prove in the next section that this guarantees that (45.1) and (45.2) hold, i.e. that E and D are inverse permutations. §45.7 shows how each of the above operations can be done efficiently.
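The key-selection steps can be sketched in Python as follows, in the order the text gives them: form n, pick a random d coprime to (p − 1) · (q − 1), then derive e as its multiplicative inverse. The small primes 1009 and 1013, the seeded generator, and the function name are assumptions for illustration; the `random` module is not suitable for real key generation.

```python
import math
import random

def generate_keys(p, q, rng):
    """Sketch of key selection for given primes p and q (toy sizes only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    # Pick d at random until it is relatively prime to (p - 1)(q - 1).
    while True:
        d = rng.randrange(3, phi)
        if math.gcd(d, phi) == 1:
            break
    # e is the multiplicative inverse of d modulo (p - 1)(q - 1).
    e = pow(d, -1, phi)   # Python 3.8+
    return (e, n), (d, n)

rng = random.Random(0)    # seeded only to make the sketch repeatable
public_key, private_key = generate_keys(1009, 1013, rng)
```

The check e · d ≡ 1 (mod (p − 1) · (q − 1)) holds by construction, which is what makes decryption undo encryption.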
上述方法不应与 Diffie 和 Hellman(1976a,此处第 42 章)提出的解决密钥分配问题的“求幂”技术相混淆。他们的技术允许两个用户确定在正常密码系统中使用的共同密钥。它不是基于活板门单向排列。Pohlig 和 Hellman (1978) 研究了一个与我们的方案相关的方案,其中求幂是以素数为模进行的。
The aforementioned method should not be confused with the “exponentiation” technique presented by Diffie and Hellman (1976a, here chapter 42) to solve the key distribution problem. Their technique permits two users to determine a key in common to be used in a normal cryptographic system. It is not based on a trap-door one-way permutation. Pohlig and Hellman (1978) study a scheme related to ours, where exponentiation is done modulo a prime number.
我们使用欧拉和费马 (Niven, 1972) 的恒等式证明解密算法的正确性:对于任何与 n 互质的整数(消息)M,M^(ϕ(n)) ≡ 1 (mod n)。 (45.3)
We demonstrate the correctness of the deciphering algorithm using an identity due to Euler and Fermat (Niven, 1972): for any integer (message) M which is relatively prime to n, M^(ϕ(n)) ≡ 1 (mod n). (45.3)
这里 ϕ(n) 是欧拉函数 (totient),给出小于 n 且与 n 互质的正整数的个数。对于素数 p,ϕ(p) = p − 1。在我们的例子中,根据欧拉函数的基本性质,我们有:ϕ(n) = ϕ(p) · ϕ(q) = (p − 1) · (q − 1) = n − (p + q) + 1。
Here ϕ(n) is the Euler totient function giving the number of positive integers less than n which are relatively prime to n. For prime numbers p, ϕ(p) = p − 1. In our case, we have by elementary properties of the totient function: ϕ(n) = ϕ(p) · ϕ(q) = (p − 1) · (q − 1) = n − (p + q) + 1.
由于 d 与 ϕ(n) 互质,因此它在以 ϕ(n) 为模的整数环中具有乘法逆元 e:e · d ≡ 1 (mod ϕ(n))。 (45.4)
Since d is relatively prime to ϕ(n), it has a multiplicative inverse e in the ring of integers modulo ϕ(n): e · d ≡ 1 (mod ϕ(n)). (45.4)
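现在我们证明方程 (45.1) 和 (45.2) 成立(也就是说,如果按上述方式选择 e 和 d,则解密工作正确)。现在 D(E(M)) ≡ (E(M))^d ≡ (M^e)^d ≡ M^(e·d) (mod n),E(D(M)) ≡ (D(M))^e ≡ (M^d)^e ≡ M^(e·d) (mod n),
We now prove that equations (45.1) and (45.2) hold (that is, that deciphering works correctly if e and d are chosen as above). Now D(E(M)) ≡ (E(M))^d ≡ (M^e)^d ≡ M^(e·d) (mod n), E(D(M)) ≡ (D(M))^e ≡ (M^d)^e ≡ M^(e·d) (mod n),
并且根据 (45.4),对某个整数 k,M^(e·d) ≡ M^(k·ϕ(n)+1) (mod n)。
and, by (45.4), M^(e·d) ≡ M^(k·ϕ(n)+1) (mod n) for some integer k.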
从 (45.3) 我们看到,对于所有不能被 p 整除的 M,M^(p−1) ≡ 1 (mod p),并且由于 (p − 1) 整除 ϕ(n),M^(k·ϕ(n)+1) ≡ M (mod p)。
From (45.3) we see that for all M such that p does not divide M, M^(p−1) ≡ 1 (mod p), and since (p − 1) divides ϕ(n), M^(k·ϕ(n)+1) ≡ M (mod p).
当 M ≡ 0 (mod p) 时,这显然成立,因此这个等式实际上对所有 M 都成立。对 q 作类似论证可得 M^(k·ϕ(n)+1) ≡ M (mod q)。
This is trivially true when M ≡ 0 (mod p), so that this equality actually holds for all M. Arguing similarly for q yields M^(k·ϕ(n)+1) ≡ M (mod q).
最后两个方程合起来意味着,对于所有 M,M^(e·d) ≡ M^(k·ϕ(n)+1) ≡ M (mod n)。
Together these last two equations imply that for all M, M^(e·d) ≡ M^(k·ϕ(n)+1) ≡ M (mod n).
这意味着 (45.1) 和 (45.2) 对所有满足 0 ≤ M < n 的 M 成立。因此 E 和 D 是互逆的置换。(我们感谢 Rich Schroeppel 提出了作者先前证明的上述改进版本。)
This implies (45.1) and (45.2) for all M, 0 ≤ M < n. Therefore E and D are inverse permutations. (We thank Rich Schroeppel for suggesting the above improved version of the authors’ previous proof.)
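The claim that the inverse relation holds for every M, including messages that share a factor with n, can be checked exhaustively on a toy modulus. The values p = 61, q = 53, and d = 2753 below are assumptions for illustration; the check is a sanity test, not a substitute for the proof.

```python
# Exhaustive check on a toy modulus (hypothetical small primes).
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
d = 2753
e = pow(d, -1, phi)     # multiplicative inverse of d modulo phi(n), Python 3.8+

# Messages NOT relatively prime to n, where the Euler-Fermat identity alone does not apply.
not_coprime = [M for M in range(n) if M % p == 0 or M % q == 0]

# D(E(M)) = M and E(D(M)) = M for every M in the range 0 to n - 1.
round_trip_ok = all(pow(pow(M, e, n), d, n) == M for M in range(n))
reverse_ok = all(pow(pow(M, d, n), e, n) == M for M in range(n))
```

The list `not_coprime` covers exactly the cases handled by the separate argument modulo p and modulo q.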
为了证明我们的方法是实用的,我们为每个所需的操作描述了一个有效的算法。
To show that our method is practical, we describe an efficient algorithm for each required operation.
1. 令 e_k e_(k−1) … e_1 e_0 为 e 的二进制表示。
1. Let e_k e_(k−1) … e_1 e_0 be the binary representation of e.
2. 将变量C设置为 1。
2. Set the variable C to 1.
3. 对于i = k , k − 1, … , 0重复步骤 3a 和 3b :
3. Repeat steps 3a and 3b for i = k, k − 1, …, 0:
3a. 将 C 设置为 C^2 除以 n 后的余数。
3a. Set C to the remainder of C^2 when divided by n.
3b. 如果 e_i = 1,则将 C 设置为 C · M 除以 n 后的余数。
3b. If e_i = 1, then set C to the remainder of C · M when divided by n.
4. 停下来。现在C是M的加密形式。
4. Halt. Now C is the encrypted form of M.
这个过程称为“通过重复平方和乘法求幂”。这个程序的效果是最好的程序的一半;更有效的程序是已知的。Knuth (1969) 详细研究了这个问题。
This procedure is called “exponentiation by repeated squaring and multiplication.” This procedure is half as good as the best; more efficient procedures are known. Knuth (1969) studies this problem in detail.
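Steps 1 to 4 translate almost line for line into Python. The function name `power_mod` is an assumption; in practice Python's built-in `pow(M, e, n)` does the same job and is what one would normally call.

```python
def power_mod(M: int, e: int, n: int) -> int:
    """Compute M^e mod n by repeated squaring and multiplication."""
    C = 1                              # step 2
    for bit in bin(e)[2:]:             # steps 1 and 3: bits of e, most significant first
        C = (C * C) % n                # step 3a: square
        if bit == "1":
            C = (C * M) % n            # step 3b: multiply when the bit is 1
    return C                           # step 4: C is the encrypted form of M

example = power_mod(65, 17, 3233)      # same result as pow(65, 17, 3233)
```

The loop runs once per bit of e, so the number of modular multiplications is at most twice the bit length of e.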
加密和解密相同的事实导致实现简单。(整个操作可以在少数专用集成电路芯片上实现。)
The fact that the enciphering and deciphering are identical leads to a simple implementation. (The whole operation can be implemented on a few special-purpose integrated circuit chips.)
高速计算机可以在几秒钟内加密200位的消息M ;专用硬件会更快。每个块的加密时间的增长速度不会快于n中位数的立方。
A high-speed computer can encrypt a 200-digit message M in a few seconds; special-purpose hardware would be much faster. The encryption time per block increases no faster than the cube of the number of digits in n.
要查找 100 位“随机”素数,请生成(奇数)100 位随机数,直到找到素数。根据素数定理(Niven,1972),在找到素数之前,大约需要测试 (ln 10^100)/2 = 115 个数字。
To find a 100-digit “random” prime number, generate (odd) 100-digit random numbers until a prime number is found. By the prime number theorem (Niven, 1972), about (ln 10^100)/2 = 115 numbers will be tested before a prime is found.
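The estimate of 115 follows from the prime number theorem: near x the density of primes is roughly 1/ln x, and restricting the candidates to odd numbers halves the expected number of trials. A short Python check of the arithmetic (the variable name is an assumption):

```python
import math

# Expected number of odd 100-digit candidates to test before hitting a prime:
# about ln(10^100) / 2, i.e. 100 * ln(10) / 2.
expected_trials = math.log(10**100) / 2
```

Rounded to the nearest integer this gives the figure quoted in the text.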
为了测试一个大数 b 的素性,我们推荐 Solovay 和 Strassen (1977) 提出的优雅的“概率”算法。它从 {1, …, b − 1} 上的均匀分布中选取一个随机数 a,并测试是否 gcd(a, b) = 1 且 J(a, b) ≡ a^((b−1)/2) (mod b), (45.5)
To test a large number b for primality we recommend the elegant “probabilistic” algorithm due to Solovay and Strassen (1977). It picks a random number a from a uniform distribution on {1, …, b − 1}, and tests whether gcd(a, b) = 1 and J(a, b) ≡ a^((b−1)/2) (mod b), (45.5)
其中 J(a, b) 是雅可比符号 (Niven, 1972)。如果 b 是素数,(45.5) 始终为真。如果 b 是合数,(45.5) 为假的概率至少为 1/2。如果 (45.5) 对于 100 个随机选择的 a 值都成立,则 b 几乎肯定是素数;b 是合数的可能性只有 2^100 分之一(可忽略不计)。即使在我们的系统中意外使用了合数,接收者也很可能会因为注意到解密无法正常工作而发现这一点。当 b 为奇数、a ≤ b 且 gcd(a, b) = 1 时,雅可比符号 J(a, b) 的取值在 {−1, 1} 中,并且可以由以下程序高效计算:J(a, b) = if a = 1 then 1 else if a is even then J(a/2, b) · (−1)^((b^2 − 1)/8) else J(b (mod a), a) · (−1)^((a − 1) · (b − 1)/4)。
where J(a, b) is the Jacobi symbol (Niven, 1972). If b is prime (45.5) is always true. If b is composite (45.5) will be false with probability at least 1/2. If (45.5) holds for 100 randomly chosen values of a then b is almost certainly prime; there is a (negligible) chance of one in 2^100 that b is composite. Even if a composite were accidentally used in our system, the receiver would probably detect this by noticing that decryption didn’t work correctly. When b is odd, a ≤ b, and gcd(a, b) = 1, the Jacobi symbol J(a, b) has a value in {−1, 1} and can be efficiently computed by the program: J(a, b) = if a = 1 then 1 else if a is even then J(a/2, b) · (−1)^((b^2 − 1)/8) else J(b (mod a), a) · (−1)^((a − 1) · (b − 1)/4).
(J(a, b) 和 gcd(a, b) 的计算也可以很好地组合起来。)请注意,该算法并不是通过尝试因式分解来测试一个数的素性。Miller (1976)、Pollard (1974) 和 Rabin (1976) 给出了其他用于测试大数素性的有效方法。
(The computations of J(a, b) and gcd(a, b) can be nicely combined, too.) Note that this algorithm does not test a number for primality by trying to factor it. Other efficient procedures for testing a large number for primality are given in Miller (1976); Pollard (1974); Rabin (1976).
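A hedged Python sketch of the probabilistic test follows. It assumes the standard form of the Solovay and Strassen condition, gcd(a, b) = 1 and J(a, b) ≡ a^((b−1)/2) (mod b), and computes the Jacobi symbol with the usual reduction rules (halve even a, otherwise swap and reduce, flipping the sign according to residues modulo 8 and 4). The function names and the iterative formulation are assumptions, not the paper's program.

```python
import math
import random

def jacobi(a: int, b: int) -> int:
    """Jacobi symbol J(a, b) for odd b > 1, 1 <= a, gcd(a, b) == 1."""
    sign = 1
    while a != 1:
        if a % 2 == 0:
            if b % 8 in (3, 5):            # (-1)^((b^2 - 1)/8) is -1
                sign = -sign
            a //= 2
        else:
            if a % 4 == 3 and b % 4 == 3:  # (-1)^((a-1)(b-1)/4) is -1
                sign = -sign
            a, b = b % a, a
    return sign

def probably_prime(b: int, rounds: int = 100, rng=random) -> bool:
    """Return False if b is certainly composite, True if it passed every round."""
    if b < 3 or b % 2 == 0:
        return b == 2
    for _ in range(rounds):
        a = rng.randrange(1, b)            # uniform on {1, ..., b - 1}
        if math.gcd(a, b) != 1:
            return False
        if jacobi(a, b) % b != pow(a, (b - 1) // 2, b):
            return False
    return True
```

A composite survives each round with probability at most 1/2, so 100 rounds leave an error chance of at most one in 2^100; a prime always passes.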
为了针对复杂的因式分解算法获得额外的保护,p和q的长度应相差几位,( p − 1) 和 ( q − 1) 都应包含大质因数,并且 gcd( p − 1 , q − 1)应该很小。后一种情况很容易检查。
To gain additional protection against sophisticated factoring algorithms, p and q should differ in length by a few digits, both (p− 1) and (q − 1) should contain large prime factors, and gcd(p − 1, q − 1) should be small. The latter condition is easily checked.
要找到一个使 ( p − 1) 具有大素因子的素数p,生成一个大的随机素数u,然后令p为序列i · u + 1 中的第一个素数,对于i = 2, 4, 6 、…… (这不会花费太长时间。)通过确保 ( u − 1) 也有一个大的质因数来提供额外的安全性。
To find a prime number p such that (p − 1) has a large prime factor, generate a large random prime number u, then let p be the first prime in the sequence i·u + 1, for i = 2, 4, 6, …. (This shouldn’t take too long.) Additional security is provided by ensuring that (u − 1) also has a large prime factor.
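The search for a prime p such that p − 1 has a large prime factor u can be sketched as below. The trial-division primality check and the small value u = 1009 are assumptions that keep the example self-contained; at realistic sizes a probabilistic test would be used instead.

```python
def is_prime(m: int) -> bool:
    """Trial division; adequate only for the toy sizes used in this sketch."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True

def prime_with_large_factor(u: int):
    """Return (p, i) where p = i*u + 1 is the first prime for i = 2, 4, 6, ..."""
    i = 2
    while not is_prime(i * u + 1):
        i += 2
    return i * u + 1, i

u = 1009                              # hypothetical "large" prime factor
p, multiplier = prime_with_large_factor(u)
```

By construction u divides p − 1, so p − 1 inherits u as a prime factor.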
高速计算机可以在几秒钟内确定一个 100 位数字是否是素数,并且可以在一两分钟内找到给定点之后的第一个素数。
A high-speed computer can determine in several seconds whether a 100-digit number is prime, and can find the first prime after a given point in a minute or two.
查找大素数的另一种方法是取一个因式分解已知的数,将其加一,然后测试结果的素性。如果找到素数 p,则可以利用 p − 1 的因式分解来证明它确实是素数。我们省略对此的讨论,因为概率方法已经足够了。
Another approach to finding large prime numbers is to take a number of known factorization, add one to it, and test the result for primality. If a prime p is found it is possible to prove that it really is prime by using the factorization of p − 1. We omit a discussion of this since the probabilistic method is adequate.
如果e结果小于 log 2 ( n ),则通过选择另一个d值重新开始。这保证了每个加密消息(除了M = 0 或M = 1)都会经历一些“环绕”(减少模n)。……
If e turns out to be less than log2(n), start over by choosing another value of d. This guarantees that every encrypted message (except M = 0 or M = 1) undergoes some “wrap-around” (reduction modulo n). …
由于没有技术可以证明加密方案是安全的,因此唯一可用的测试就是看看是否有人能想出破解它的方法。NBS 标准就是这样“认证”的;IBM 花了十七个人年的时间试图破解该方案,但毫无结果。一旦一种方法成功地抵御了这种协同攻击,出于实际目的,它就可以被认为是安全的。(实际上,关于 NBS 方法的安全性存在一些争议 [Diffie 和 Hellman,1977]。)
Since no techniques exist to prove that an encryption scheme is secure, the only test available is to see whether anyone can think of a way to break it. The NBS standard was “certified” this way; seventeen man-years at IBM were spent fruitlessly trying to break that scheme. Once a method has successfully resisted such a concerted attack it may for practical purposes be considered secure. (Actually there is some controversy concerning the security of the NBS method [Diffie and Hellman, 1977].)
我们将在接下来的部分中展示,所有破解我们系统的明显方法至少与分解 n 一样困难。虽然无法证明对大数进行因式分解是困难的,但这是一个众所周知的问题,许多著名数学家在过去三百年中一直在研究它。Fermat (1601?–1665) 和 Legendre (1752–1833) 开发了因式分解算法;当今一些更有效的算法都是基于勒让德的工作。然而,正如我们将在下一节中看到的,尚未有人找到一种可以在合理的时间内分解 200 位数字的算法。我们得出的结论是,我们的系统已经通过之前寻找有效因式分解算法的努力得到了部分“认证”。
We show in the next sections that all the obvious approaches for breaking our system are at least as difficult as factoring n. While factoring large numbers is not provably difficult, it is a well-known problem that has been worked on for the last three hundred years by many famous mathematicians. Fermat (1601?–1665) and Legendre (1752–1833) developed factoring algorithms; some of today’s more efficient algorithms are based on the work of Legendre. As we shall see in the next section, however, no one has yet found an algorithm which can factor a 200-digit number in a reasonable amount of time. We conclude that our system has already been partially “certified” by these previous efforts to find efficient factoring algorithms.
在以下部分中,我们考虑密码分析者可能尝试从公开披露的加密密钥中确定秘密解密密钥的方法。我们不考虑保护解密密钥不被盗窃的方法;通常的物理安全方法应该就足够了。(例如,加密设备可以是一个单独的设备,也可以用于生成加密和解密密钥,这样解密密钥永远不会被打印出来(即使对于其所有者也是如此),而仅用于解密消息。如果设备遭到篡改,它可以擦除解密密钥。)
In the following sections we consider ways a cryptanalyst might try to determine the secret decryption key from the publicly revealed encryption key. We do not consider ways of protecting the decryption key from theft; the usual physical security methods should suffice. (For example, the encryption device could be a separate device which could also be used to generate the encryption and decryption keys, such that the decryption key is never printed out (even for its owner) but only used to decrypt messages. The device could erase the decryption key if it was tampered with.)
我们认为计算 ϕ(n) 并不比因式分解 n 更容易,因为知道 ϕ(n) 就能让密码分析者轻松地因式分解 n。这种分解 n 的方法尚未被证明是实用的。
We argue that computing ϕ(n) is no easier than factoring n, since knowing ϕ(n) would enable the cryptanalyst to factor n easily. This approach to factoring n has not turned out to be practical.
如何使用 ϕ(n) 对 n 进行因式分解?首先,根据 n 和 ϕ(n) = n − (p + q) + 1 得到 (p + q)。然后 (p − q) 是 (p + q)^2 − 4n 的平方根。最后,q 是 (p + q) 和 (p − q) 之差的一半。
How can n be factored using ϕ(n)? First, (p + q) is obtained from n and ϕ(n) = n − (p + q) + 1. Then (p − q) is the square root of (p + q)^2 − 4n. Finally, q is half the difference of (p + q) and (p − q).
因此,通过计算phi ( n )来破坏我们的系统并不比通过因式分解n来破坏我们的系统容易。(这就是为什么n必须是合数;如果n是素数,则计算phi ( n ) 是微不足道的。)
Therefore breaking our system by computing ϕ(n) is no easier than breaking our system by factoring n. (This is why n must be composite; ϕ(n) is trivial to compute if n is prime.)
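The recipe for recovering the factors from ϕ(n) is short enough to write out directly. This Python sketch uses the toy values n = 3233 and ϕ(n) = 3120 as assumptions; `math.isqrt` (Python 3.8+) takes the exact integer square root.

```python
import math

def factor_from_phi(n: int, phi: int):
    """Recover (p, q) with p >= q from n = p*q and phi = (p - 1)(q - 1)."""
    s = n - phi + 1                      # p + q, since phi = n - (p + q) + 1
    diff = math.isqrt(s * s - 4 * n)     # p - q, the square root of (p + q)^2 - 4n
    q = (s - diff) // 2                  # half the difference of (p + q) and (p - q)
    p = (s + diff) // 2
    return p, q

factors = factor_from_phi(3233, 3120)
```

Only a subtraction, a square root, and a halving are needed, which is why knowledge of ϕ(n) is as good as knowledge of the factors.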
我们认为,对于密码分析者来说,计算 d 并不比分解 n 更容易,因为一旦知道 d,就可以轻松分解 n。这种因式分解方法也没有取得成果。
We argue that computing d is no easier for a cryptanalyst than factoring n, since once d is known n could be factored easily. This approach to factoring has also not turned out to be fruitful.
知道 d 后,n 就可以按如下方式因式分解。一旦密码分析者知道 d,他就可以计算 e · d − 1,它是 ϕ(n) 的倍数。Miller 证明了 n 可以使用 ϕ(n) 的任意倍数进行因式分解。因此,如果 n 很大,密码分析者确定 d 应该不会比分解 n 更容易。
A knowledge of d enables n to be factored as follows. Once a cryptanalyst knows d he can calculate e · d − 1, which is a multiple of ϕ(n). Miller has shown that n can be factored using any multiple of ϕ(n). Therefore if n is large a cryptanalyst should not be able to determine d any easier than he can factor n.
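The text cites Miller for the fact that any multiple of ϕ(n) suffices to factor n but does not spell out a procedure. The Python sketch below uses the commonly taught randomized approach, which is an assumption here rather than Miller's own algorithm: write e · d − 1 as 2^t · r with r odd, raise a random base to the power r, and square repeatedly until a square root of 1 other than ±1 appears. The toy key values are also assumptions.

```python
import math
import random

def factor_from_d(n: int, e: int, d: int, rng=random):
    """Factor n = p*q given a matching exponent pair (e, d); randomized sketch."""
    k = e * d - 1                 # a multiple of phi(n)
    r, t = k, 0
    while r % 2 == 0:             # k = 2^t * r with r odd
        r //= 2
        t += 1
    while True:
        g = rng.randrange(2, n - 1)
        common = math.gcd(g, n)
        if common != 1:           # lucky guess: g already shares a factor with n
            return common, n // common
        x = pow(g, r, n)
        for _ in range(t):
            y = pow(x, 2, n)
            if y == 1 and x != 1 and x != n - 1:
                # x is a nontrivial square root of 1, so gcd(x - 1, n) splits n.
                p = math.gcd(x - 1, n)
                return p, n // p
            x = y

p_found, q_found = factor_from_d(3233, 17, 2753)
```

Each random base succeeds with probability at least about one half, so a handful of attempts is normally enough.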
密码分析者可能希望找到某个 d′,它与公钥密码系统的用户秘密持有的 d 等效。如果这样的值 d′ 很常见,那么暴力搜索就可能攻破系统。然而,所有此类 d′ 彼此相差 (p − 1) 和 (q − 1) 的最小公倍数的倍数,而找到其中一个就可以对 n 进行因式分解。(在 (45.3) 和 (45.4) 中,ϕ(n) 可以用 lcm(p − 1, q − 1) 代替。)因此,找到任何这样的 d′ 与因式分解 n 一样困难。
A cryptanalyst may hope to find a d′ which is equivalent to the d secretly held by a user of the public-key cryptosystem. If such values d′ were common then a brute-force search could break the system. However, all such d′ differ by multiples of the least common multiple of (p − 1) and (q − 1), and finding one enables n to be factored. (In (45.3) and (45.4), ϕ(n) can be replaced by lcm(p − 1, q − 1).) Finding any such d′ is therefore as difficult as factoring n.
我们的方法应当通过让上述难解性猜想经受住有组织的反驳尝试来获得认证。读者面临的挑战是找到一种方法来“攻破”我们的方法。
Our method should be certified by having the above conjecture of intractability withstand a concerted attempt to disprove it. The reader is challenged to find a way to “break” our method.
签名消息可能必须“重新分块”才能进行加密,因为签名所用的 n 可能大于加密所用的 n(每个用户都有自己的 n)。这可以通过如下方式避免。为公钥密码系统选择一个阈值 h(例如 h = 10^199)。每个用户维护两个公开的 (e, n) 对,一个用于加密,一个用于签名验证,其中每个签名 n 都小于 h,并且每个加密 n 都大于 h。这样就不需要为了加密签名消息而重新分块;消息根据发送者的签名 n 进行分块。
A signed message may have to be “reblocked” for encryption since the signature n may be larger than the encryption n (every user has his own n). This can be avoided as follows. A threshold value h is chosen (say h = 10^199) for the public-key cryptosystem. Every user maintains two public (e, n) pairs, one for enciphering and one for signature-verification, where every signature n is less than h, and every enciphering n is greater than h. Reblocking to encipher a signed message is then unnecessary; the message is blocked according to the transmitter’s signature n.
另一种解决方案使用 Levine 和 Brawley (1977) 中给出的技术。每个用户都有一个 ( e, n ) 对,其中n介于h和 2 h之间,其中h是上述阈值。消息被编码为小于h的数字并像以前一样进行加密,只是如果密文大于h,则会重复重新加密,直到它小于h。类似地,对于解密,密文被重复解密以获得小于h的值。如果n接近h,则重新加密将很少发生。(无限循环是不可能的,因为最坏的情况是消息本身被加密。)
Another solution uses a technique given in Levine and Brawley (1977). Each user has a single (e, n) pair where n is between h and 2h, where h is a threshold as above. A message is encoded as a number less than h and enciphered as before, except that if the ciphertext is greater than h, it is repeatedly re-enciphered until it is less than h. Similarly for decryption the ciphertext is repeatedly deciphered to obtain a value less than h. If n is near h re-enciphering will be infrequent. (Infinite looping is not possible, since at worst a message is enciphered as itself.)
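The repeated re-enciphering idea can be sketched in Python with a toy key. The values n = 3233, e = 17, d = 2753 and the threshold h = 2000 (so that h < n < 2h) are assumptions for illustration, and the sketch treats a value equal to h as out of range, since messages are required to be less than h.

```python
N, E_EXP, D_EXP = 3233, 17, 2753    # toy key (hypothetical values)
H = 2000                            # threshold with H < N < 2 * H

def encipher_below_threshold(message: int) -> int:
    """Encipher, then keep re-enciphering until the result is below H."""
    if not 0 <= message < H:
        raise ValueError("message must be less than the threshold h")
    c = pow(message, E_EXP, N)
    while c >= H:
        c = pow(c, E_EXP, N)
    return c

def decipher_below_threshold(ciphertext: int) -> int:
    """Decipher repeatedly until a value below H (the original message) appears."""
    m = pow(ciphertext, D_EXP, N)
    while m >= H:
        m = pow(m, D_EXP, N)
    return m

c = encipher_below_threshold(1234)
m = decipher_below_threshold(c)
```

Because enciphering is a permutation of the integers 0 to n − 1, repeated enciphering eventually cycles back to the message itself, which is below h, so both loops terminate.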
我们提出了一种实现公钥密码系统的方法,其安全性部分取决于分解大数的难度。如果我们的方法的安全性被证明是足够的,那么它允许在不使用信使携带密钥的情况下建立安全通信,并且还允许人们“签署”数字化文档。
We have proposed a method for implementing a public-key cryptosystem whose security rests in part on the difficulty of factoring large numbers. If the security of our method proves to be adequate, it permits secure communications to be established without the use of couriers to carry keys, and it also permits one to “sign” digitized documents.
需要更详细地检查该系统的安全性。特别是,应仔细研究分解大数的困难。敦促读者找到一种“打破”系统的方法。一旦该方法在足够长的时间内抵御了所有攻击,就可以以合理的信心使用它。
The security of this system needs to be examined in more detail. In particular, the difficulty of factoring large numbers should be examined very closely. The reader is urged to find a way to “break” the system. Once the method has withstood all attacks for a sufficient length of time it may be used with a reasonable amount of confidence.
我们的加密函数是作者所知的“活板门单向排列”的唯一候选函数。如果有一天我们系统的安全性被证明是不够的,可能需要找到其他示例,以提供替代实现。当然,这些功能还有许多新的应用有待发现。
Our encryption function is the only candidate for a “trap-door one-way permutation” known to the authors. It might be desirable to find other examples, to provide alternative implementations should the security of our system turn out someday to be inadequate. There are surely also many new applications to be discovered for these functions.
转载自 Rivest 等人 (1978),经计算机协会 (ACM) 许可。
Reprinted from Rivest et al. (1978), with permission from the Association for Computing Machinery.
随着计算机联网、信息传播和交流,保密和隐私开始呈现出在存储纸质文档和集中数据库的世界中并不重要的维度。有趣的隐喻——本文基于 1968 年一本数学书中提出的问题——在网络信息世界中具有重要意义。密码学的进步(第 42 章和第 45 章)承诺(尚未经过数学验证)普通人可以在远距离进行保密通信,并相信如果不进行不切实际的大量计算工作,他们的秘密通信就不会受到损害。相比之下,本文描述了一种在各方之间共享秘密的协议,各方必须在一定程度上进行合作才能恢复秘密——事实证明,较小的各方阴谋集团不可能揭露秘密,甚至投入无限的资源。由这篇简短的贡献发起的秘密共享领域现已发展到包括数千篇论文。
As computers were networked and information was spread around and communicated, secrecy and privacy began to assume dimensions that had not been significant in a world of stored paper documents and centralized data banks. Playful metaphors—this paper is based on a problem posed in a 1968 mathematics book—assumed grave significance in the world of networked information. Advances in cryptography (chapters 42 and 45) held out the promise (not yet mathematically verified) that ordinary people might communicate confidentially at a distance, confident that their secret communication could not be compromised without an unrealistically large level of computational effort. This paper, by contrast, describes a protocol for sharing a secret among parties who must, to a degree that can be stipulated, cooperate in order to recover it—in such a way that it is provably impossible for a smaller cabal of the parties to expose the secret, even by devoting unlimited resources to the effort. The field of secret sharing initiated with this one short contribution has now grown to include many thousands of papers.
我们在第 45 章中认识了 Adi Shamir(生于 1952 年),他是 RSA 公钥密码系统的作者之一。本文首先提出了一个看似不可能的挑战,并仅使用高中数学提出了一个漂亮、简短的解决方案。这实际上是一个关于分布式计算机系统的问题,但没有提到计算机技术本身。计算机科学在 20 世纪 80 年代蓬勃发展,该领域仍然充满了像这样的简单问题,亟待优雅的解决方案。
We met Adi Shamir (b. 1952) in chapter 45 as one of the authors of the RSA public-key cryptosystem. This paper starts by posing an apparently impossible challenge, and lays out a beautiful, short solution using only high school mathematics. It is practically a problem about distributed computer systems, yet mentions nothing about computer technology itself. Computer science exploded in the 1980s, and the field remains full of simply stated problems like this one, begging for elegant solutions.
Liu (1968) 考虑了以下问题:十一位科学家正在研究一个秘密项目。他们希望将文件锁在柜子里,使得当且仅当六名或更多科学家在场时才能打开柜子。最少需要多少把锁?每位科学家最少必须携带多少把钥匙?
Liu (1968) considers the following problem: Eleven scientists are working on a secret project. They wish to lock up the documents in a cabinet so that the cabinet can be opened if and only if six or more of the scientists are present. What is the smallest number of locks needed? What is the smallest number of keys to the locks each scientist must carry?
不难证明,最小解决方案需要 462 把锁,每位科学家需要携带 252 把钥匙。这些数字显然是不切实际的,而且当科学家数量增加时,情况会呈指数级恶化。在本文中,我们将该问题推广为:秘密是某个数据 D(例如保险箱密码),并且也允许使用(操纵该数据的)非机械解决方案。我们的目标是将 D 分为 n 个片段 D1, …, Dn,使得:
It is not hard to show that the minimal solution uses 462 locks and 252 keys per scientist. These numbers are clearly impractical, and they become exponentially worse when the number of scientists increases. In this paper we generalize the problem to one in which the secret is some data D (e.g., the safe combination) and in which nonmechanical solutions (which manipulate this data) are also allowed. Our goal is to divide D into n pieces D1, …, Dn in such a way that:
1. 任何k 个或更多D i块的知识使得D易于计算;
1. knowledge of any k or more Di pieces makes D easily computable;
2. 任何k − 1 或更少的D i块的知识使得D完全不确定(从某种意义上说,它的所有可能值都是同等可能的)。
2. knowledge of any k − 1 or fewer Di pieces leaves D completely undetermined (in the sense that all its possible values are equally likely).
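The lock and key counts quoted for Liu's problem can be checked with a short calculation. The sketch below assumes the standard counting argument (it is not spelled out in the paper): every group of k − 1 scientists must be stopped by at least one lock that none of them can open, and each scientist needs a key for every such group drawn from the other n − 1 scientists. The function name `lock_and_key_counts` is introduced here purely for illustration.

```python
from math import comb


def lock_and_key_counts(n: int, k: int) -> tuple[int, int]:
    """Minimal (locks, keys per person) so that any k of n people can open the cabinet.

    Assumes the usual argument: one distinct lock per (k-1)-subset of people,
    with keys to that lock held by everyone outside the subset.
    """
    locks = comb(n, k - 1)          # one lock per group of k-1 people to be kept out
    keys_each = comb(n - 1, k - 1)  # groups of k-1 chosen from the other n-1 people
    return locks, keys_each


print(lock_and_key_counts(11, 6))  # (462, 252)
```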
这种方案称为( k,n )阈值方案。有效的阈值方案对于加密密钥的管理非常有帮助。为了保护数据,我们可以对其进行加密,但是为了保护加密密钥,我们需要一种不同的方法(进一步的加密会改变问题而不是解决问题)。最安全的密钥管理方案将密钥保存在一个受到严密保护的单一位置(计算机、人脑或保险箱)。这种方案非常不可靠,因为一次不幸(计算机故障、突然死亡或破坏)就可能导致信息无法访问。一个显而易见的解决方案是在不同位置存储密钥的多个副本,但这会增加安全漏洞(计算机渗透、背叛或人为错误)的危险。通过使用n = 2 k − 1 的 ( k, n ) 阈值方案,我们得到了一个非常鲁棒的密钥管理方案:即使n个片段中的⌊ n/ 2 ⌋ = k − 1被破坏,我们也可以恢复原始密钥,但即使安全漏洞暴露了剩余k个片段中的⌊ n/ 2 ⌋ = k − 1 ,我们的对手也无法重建密钥。
Such a scheme is called a (k, n) threshold scheme. Efficient threshold schemes can be very helpful in the management of cryptographic keys. In order to protect data we can encrypt it, but in order to protect the encryption key we need a different method (further encryptions change the problem rather than solve it). The most secure key management scheme keeps the key in a single, well-guarded location (a computer, a human brain, or a safe). This scheme is highly unreliable since a single misfortune (a computer breakdown, sudden death, or sabotage) can make the information inaccessible. An obvious solution is to store multiple copies of the key at different locations, but this increases the danger of security breaches (computer penetration, betrayal, or human errors). By using a (k, n) threshold scheme with n = 2k − 1 we get a very robust key management scheme: We can recover the original key even when ⌊n/2⌋ = k − 1 of the n pieces are destroyed, but our opponents cannot reconstruct the key even when security breaches expose ⌊n/2⌋ = k − 1 of the remaining k pieces.
在其他应用中,权衡不是在保密性和可靠性之间,而是在安全性和使用便利性之间。例如,考虑一家对其所有支票进行数字签名的公司(Rivest 等,1978)。如果每位高管都获得一份公司秘密签名密钥的副本,该系统很方便,但很容易被滥用。如果需要公司所有高管的合作才能签署每张支票,则系统安全但不方便。标准解决方案要求每张支票至少有三个签名,并且使用 (3, n) 阈值方案很容易实现。每位高管都会获得一张带有一个 Di 片段的小磁卡,公司的签名生成设备接受其中的任意三张磁卡,以便生成(并随后销毁)实际签名密钥 D 的临时副本。该设备不包含任何秘密信息,因此不需要防范他人检查。一名不忠的高管必须至少有两名同谋才能在该方案中伪造公司签名。
In other applications the tradeoff is not between secrecy and reliability, but between safety and convenience of use. Consider, for example, a company that digitally signs all its checks (Rivest et al., 1978). If each executive is given a copy of the company’s secret signature key, the system is convenient but easy to misuse. If the cooperation of all the company’s executives is necessary in order to sign each check, the system is safe but inconvenient. The standard solution requires at least three signatures per check, and it is easy to implement with a (3, n) threshold scheme. Each executive is given a small magnetic card with one Di piece, and the company’s signature generating device accepts any three of them in order to generate (and later destroy) a temporary copy of the actual signature key D. The device does not contain any secret information and thus it need not be protected against inspection. An unfaithful executive must have at least two accomplices in order to forge the company’s signature in this scheme.
阈值方案非常适合具有利益冲突的一组相互怀疑的个人必须合作的应用。理想情况下,我们希望合作建立在双方同意的基础上,但这种机制赋予每个成员的否决权可能会瘫痪该组织的活动。通过正确选择k和n参数,我们可以赋予任何足够大的多数人采取某些行动的权力,同时赋予任何足够大的少数人阻止它的权力。
Threshold schemes are ideally suited to applications in which a group of mutually suspicious individuals with conflicting interests must cooperate. Ideally we would like the cooperation to be based on mutual consent, but the veto power this mechanism gives to each member can paralyze the activities of the group. By properly choosing the k and n parameters we can give any sufficiently large majority the authority to take some action while giving any sufficiently large minority the power to block it.
我们的方案基于多项式插值:给定二维平面中的 k 个点 (x1, y1), …, (xk, yk),其中各 xi 互不相同,则存在且仅存在一个 k − 1 次多项式 q(x),使得对所有 i 都有 q(xi) = yi。不失一般性,我们可以假设数据 D 是(或可以转化为)一个数。为了将其分成若干片段 Di,我们选择一个随机的 k − 1 次多项式 q(x) = a_0 + a_1 x + … + a_{k−1} x^{k−1},其中 a_0 = D,并计算:
Our scheme is based on polynomial interpolation: given k points in the 2-dimensional plane (x1, y1), …, (xk, yk) with distinct xi’s, there is one and only one polynomial q(x) of degree k − 1 such that q(xi) = yi for all i. Without loss of generality, we can assume that the data D is (or can be made) a number. To divide it into pieces Di, we pick a random k − 1 degree polynomial q(x) = a_0 + a_1 x + … + a_{k−1} x^{k−1} in which a_0 = D, and evaluate:
D1 = q(1), …, Di = q(i), …, Dn = q(n).
给定这些Di值的k的任何子集(连同它们的标识索引),我们可以通过插值找到q ( x ) 的系数,然后评估D = q (0)。另一方面,仅了解这些值中的k − 1 个并不足以计算D。
Given any subset of k of these Di values (together with their identifying indices), we can find the coefficients of q(x) by interpolation, and then evaluate D = q(0). Knowledge of just k − 1 of these values, on the other hand, does not suffice in order to calculate D.
为了使这一说法更加精确,我们使用模算术而不是实数算术。以素数p为模的整数集形成一个可以进行插值的域。给定一个整数值数据D,我们选择一个比D和n都大的素数p。q ( x ) 中的系数a 1 , … , a k -1是从 [0 , p )中的整数上的均匀分布中随机选择的,并且值D 1 , … , D n是对p取模计算的。
To make this claim more precise, we use modular arithmetic instead of real arithmetic. The set of integers modulo a prime number p forms a field in which interpolation is possible. Given an integer valued data D, we pick a prime p which is bigger than both D and n. The coefficients a1, …, ak−1 in q(x) are randomly chosen from a uniform distribution over the integers in [0, p), and the values D1, …, Dn are computed modulo p.
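The construction just described can be sketched in a few lines of Python. This is a minimal illustration, not a hardened implementation. Piece i is taken to be the pair (i, q(i) mod p), and recovery uses Lagrange interpolation evaluated at x = 0. The function names `split_secret` and `recover_secret`, the choice of the prime 2^31 − 1, and the use of Python's `secrets` module for randomness are all assumptions of this sketch.

```python
import secrets


def split_secret(secret: int, k: int, n: int, p: int) -> list[tuple[int, int]]:
    """Return n pieces (i, q(i) mod p) of a random polynomial of degree at most k-1 with q(0) = secret."""
    if not (0 <= secret < p and 0 < k <= n < p):
        raise ValueError("need 0 <= secret < p and 0 < k <= n < p")
    # a_0 is the secret; a_1 .. a_{k-1} are drawn uniformly from [0, p)
    coeffs = [secret] + [secrets.randbelow(p) for _ in range(k - 1)]
    pieces = []
    for i in range(1, n + 1):
        y = 0
        for a in reversed(coeffs):  # Horner's rule, reduced mod p at each step
            y = (y * i + a) % p
        pieces.append((i, y))
    return pieces


def recover_secret(pieces: list[tuple[int, int]], p: int) -> int:
    """Evaluate the interpolating polynomial at x = 0 (Lagrange form) over the integers mod p."""
    secret = 0
    for j, (xj, yj) in enumerate(pieces):
        num, den = 1, 1
        for m, (xm, _) in enumerate(pieces):
            if m != j:
                num = (num * (-xm)) % p
                den = (den * (xj - xm)) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+)
        secret = (secret + yj * num * pow(den, -1, p)) % p
    return secret


p = 2_147_483_647          # the prime 2**31 - 1, larger than both D and n
pieces = split_secret(123_456_789, k=3, n=5, p=p)
print(recover_secret(pieces[:3], p))                          # 123456789
print(recover_secret([pieces[0], pieces[2], pieces[4]], p))   # 123456789
```

Any three of the five pieces give the same result, because three points determine the polynomial of degree at most two uniquely.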
现在让我们假设这 n 个片段中的 k − 1 个被泄露给对手。对于 [0, p) 中的每个候选值 D′,他可以构造一个且仅有一个 k − 1 次多项式 q′(x),使得 q′(0) = D′,并且对于 k − 1 个给定的自变量有 q′(i) = Di。根据构造,这 p 个可能的多项式的可能性相同,因此对手绝对无法推断出 D 的真实值。
Let us now assume that k − 1 of these n pieces are revealed to an opponent. For each candidate value D′ in [0, p) he can construct one and only one polynomial q′(x) of degree k − 1 such that q′(0) = D′ and q′(i) = Di for the k − 1 given arguments. By construction, these p possible polynomials are equally likely, and thus there is absolutely nothing the opponent can deduce about the real value of D.
Aho 等人 (1974) 和 Knuth (1997b) 讨论了用于多项式求值和插值的高效 O(n log^2 n) 算法,但即使是简单的二次算法对于实际的密钥管理方案来说也足够快。如果数字 D 很长,建议将其分成较短的位块(分别处理),以避免多精度算术运算。这些块不能任意短,因为 p 的最小可用值为 n + 1([0, p) 中必须至少有 n + 1 个不同的自变量用于对 q(x) 求值)。然而,这并不是一个严重的限制,因为 16 位模数(可以由廉价的 16 位算术单元处理)足以满足多达 64,000 个 Di 片段的应用。
Efficient O(n log^2 n) algorithms for polynomial evaluation and interpolation are discussed in Aho et al. (1974) and Knuth (1997b), but even the straightforward quadratic algorithms are fast enough for practical key management schemes. If the number D is long, it is advisable to break it into shorter blocks of bits (which are handled separately) in order to avoid multiprecision arithmetic operations. The blocks cannot be arbitrarily short, since the smallest usable value of p is n + 1 (there must be at least n + 1 distinct arguments in [0, p) to evaluate q(x) at). However, this is not a severe limitation since a sixteen-bit modulus (which can be handled by a cheap sixteen-bit arithmetic unit) suffices for applications with up to 64,000 Di pieces.
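The block-by-block approach can be illustrated with 8-bit blocks. The sketch below shares a byte string one byte at a time, using a fresh random polynomial for each byte and the modulus p = 257, the smallest prime that exceeds every byte value. The block size, the modulus, and the names `split_bytes` and `recover_bytes` are assumptions of this example rather than choices made in the paper. With this modulus n is limited to at most 256 pieces, and a piece value can be 256, so pieces are kept as lists of integers rather than as bytes.

```python
import secrets

P = 257  # smallest prime above 255 (assumption of this sketch)


def _evaluate(coeffs, x):
    y = 0
    for a in reversed(coeffs):  # Horner's rule mod P
        y = (y * x + a) % P
    return y


def _interpolate_at_zero(points):
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = (num * (-xm)) % P
                den = (den * (xj - xm)) % P
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def split_bytes(data: bytes, k: int, n: int) -> dict[int, list[int]]:
    """Share each byte separately; pieces[i][b] is the value q_b(i) for block b."""
    if not 0 < k <= n < P:
        raise ValueError("need 0 < k <= n < 257")
    pieces = {i: [] for i in range(1, n + 1)}
    for block in data:
        coeffs = [block] + [secrets.randbelow(P) for _ in range(k - 1)]
        for i in pieces:
            pieces[i].append(_evaluate(coeffs, i))
    return pieces


def recover_bytes(subset: dict[int, list[int]]) -> bytes:
    """Rebuild the data from at least k pieces, one block at a time."""
    indices = list(subset)
    length = len(subset[indices[0]])
    return bytes(
        _interpolate_at_zero([(i, subset[i][b]) for i in indices])
        for b in range(length)
    )


pieces = split_bytes(b"safe combination", k=3, n=5)
print(recover_bytes({i: pieces[i] for i in (2, 4, 5)}))  # b'safe combination'
```

Only small-integer arithmetic is needed, however long the data is.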
此 ( k, n ) 阈值方案(与机械锁和钥匙解决方案相比)的一些有用属性是:
Some of the useful properties of this (k, n) threshold scheme (when compared to the mechanical locks and keys solutions) are:
1、每块的大小不超过原始数据的大小。
1. The size of each piece does not exceed the size of the original data.
2. 当 k 保持固定时,Di 片段可以动态添加或删除(例如,当高管加入或离开公司时),而不影响其他 Di 片段。(只有当离职的高管使某个片段完全无法访问,甚至他自己也无法访问时,该片段才算被删除。)
2. When k is kept fixed, Di pieces can be dynamically added or deleted (e.g., when executives join or leave the company) without affecting the other Di pieces. (A piece is deleted only when a leaving executive makes it completely inaccessible, even to himself.)
3.在不改变原始数据D的情况下,很容易改变D i部分——我们所需要的只是一个具有相同自由项的新多项式q ( x ) 。这种类型的频繁更改可以极大地增强安全性,因为安全漏洞暴露的部分无法累积,除非它们都是 q ( x )多项式的同一版本的值。
3. It is easy to change the Di pieces without changing the original data D—all we need is a new polynomial q(x) with the same free term. A frequent change of this type can greatly enhance security since the pieces exposed by security breaches cannot be accumulated unless all of them are values of the same edition of the q(x) polynomial.
4. 通过使用多项式值的元组作为 Di 片段,我们可以得到一个分层方案,其中确定 D 所需的片段数取决于它们的重要性。例如,如果我们为公司总裁提供三个 q(x) 值,每位副总裁提供两个 q(x) 值,每位高管提供一个 q(x) 值,则 (3, n) 阈值方案使得支票可以由任何三名高管签署,或由其中一人是副总裁的任何两名高管签署,或由总裁单独签署。
4. By using tuples of polynomial values as Di pieces, we can get a hierarchical scheme in which the number of pieces needed to determine D depends on their importance. For example, if we give the company’s president three values of q(x), each vice-president two values of q(x), and each executive one value of q(x), then a (3, n) threshold scheme enables checks to be signed either by any three executives, or by any two executives one of whom is a vice-president, or by the president alone.
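The hierarchical arrangement in property 4 can be sketched as follows: one random polynomial with threshold k = 3 is evaluated at eight points, and the points are handed out in tuples of three, two, and one. The modulus, the example secret, the staffing (one president, one vice-president, three other executives), and the helper names `sample_points` and `interpolate_at_zero` are assumptions made for illustration.

```python
import secrets

P = 2_147_483_647  # the prime 2**31 - 1 (arbitrary choice for this sketch)


def sample_points(secret: int, k: int, count: int) -> list[tuple[int, int]]:
    """Evaluate one random polynomial of degree at most k-1 with q(0) = secret at x = 1..count."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(k - 1)]
    points = []
    for x in range(1, count + 1):
        y = 0
        for a in reversed(coeffs):  # Horner's rule mod P
            y = (y * x + a) % P
        points.append((x, y))
    return points


def interpolate_at_zero(points: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the integers mod P."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = (num * (-xm)) % P
                den = (den * (xj - xm)) % P
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


signing_key = 20_211_979
points = sample_points(signing_key, k=3, count=8)
president = points[0:3]        # three values of q(x)
vice_president = points[3:5]   # two values
exec_a, exec_b, exec_c = points[5:6], points[6:7], points[7:8]  # one value each

print(interpolate_at_zero(president) == signing_key)                  # True
print(interpolate_at_zero(vice_president + exec_a) == signing_key)    # True
print(interpolate_at_zero(exec_a + exec_b + exec_c) == signing_key)   # True
```

Calling `sample_points` again with the same secret yields a fresh polynomial with the same free term, which is the kind of re-issue that property 3 describes.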
多项式可以替换为任何其他易于评估和插值的函数集合。GR Blakley (1979) 最近开发了一种不同的(效率稍低)阈值方案。
The polynomials can be replaced by any other collection of functions which are easy to evaluate and to interpolate. A different (and somewhat less efficient) threshold scheme was recently developed by G. R. Blakley (1979).
经计算机协会许可,转载自 Shamir (1979)。
Reprinted from Shamir (1979), with permission from the Association for Computing Machinery.